IEEE Infrastructure 2020 Keynote Speakers
National Energy Research Scientific Computing Center (NERSC) at Berkeley Lab, Division Deputy and Department Head
Katie is the Division Deputy and Data Department Head at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. Katie is also the Project Director of the Perlmutter system, NERSC's next-generation system arriving in 2021 that will support large-scale simulation and data workloads, and was Project Director of the 2016 Cori supercomputer. Katie is Principal Investigator on a research project titled "Science Search: Automated MetaData Using Machine Learning" and is interested in how experimental science facilities can leverage high-performance computing. Before coming to NERSC, Katie worked at the ASC Flash Center at the University of Chicago on the FLASH code, a highly scalable, parallel, adaptive-mesh-refinement astrophysics application. She has an M.S. in Computer Science from the University of Chicago and a bachelor's degree in Physics from Wellesley College.
Introducing the Perlmutter Supercomputer at Lawrence Berkeley National Laboratory
The Perlmutter supercomputer at the National Energy Research Scientific Computing (NERSC) facility will be deployed in early 2021 to support the large-scale simulation, data analysis, and machine learning workloads of researchers funded by the Department of Energy Office of Science. Named after Nobel Laureate Saul Perlmutter, the 100 PF Cray system will include next-generation AMD CPUs, NVIDIA GPU compute nodes, a 30 PB all-flash Lustre file system, and the Ethernet-compatible Cray Slingshot interconnect. This talk will discuss the Perlmutter architecture, applications, and software, and plans to support a growing and diversifying workload that includes AI and large-scale data analysis.
HTC/DeepQ Healthcare, President
Stanford University, Adjunct Professor
Edward Y. Chang has served as the President of AI Research and Healthcare (DeepQ) at HTC since 2012. He also currently serves as an adjunct professor at Stanford University and a technical advisor to SmartNews. His most recent notable work is co-leading the DeepQ project to win the XPRIZE medical IoT contest in 2017, with a 1M USD prize. Prior to his current posts, Ed was a director of Google Research from 2006 to 2012, leading research and development in areas including scalable machine learning, indoor localization, and Google Q&A. His contributions in data-driven machine learning (US patents 8798375 and 9547914) and his ImageNet sponsorship helped fuel the success of AlexNet and the recent resurgence of AI. His open-source code for parallel SVMs, parallel LDA, parallel spectral clustering, and parallel frequent itemset mining (adopted by Berkeley Spark) has been collectively downloaded over 30,000 times. Prior to Google, Ed was a full professor of Electrical & Computer Engineering at the University of California, Santa Barbara. He joined UCSB in 1999 after receiving his PhD from Stanford University. Ed is an IEEE Fellow for his contributions to scalable machine learning.
Scarcity, Diversity, and Privacy of Data in Artificial Intelligence for Precision Medicine
Recent successes of AI in several application domains attest to the importance of big data in achieving high performance on domain-specific metrics. In the healthcare domain, however, big data at the million scale is generally not available. Furthermore, safeguarding data privacy is crucial. This talk first enumerates key data issues in applying AI to the healthcare domain, including data scarcity, out-of-distribution data, and data privacy. We then present remedies: transfer learning for data scarcity, knowledge-guided multimodal learning for out-of-distribution generalization, and distributed ledgers for preserving data privacy.
Hewlett Packard Labs, Fellow and VP
Paolo Faraboschi is a Fellow and VP at Hewlett Packard Labs. His interests lie at the intersection of hardware and software, including HPC, workload-optimized SoCs, and parallel systems. His current research focuses on next-generation memory-driven computing systems, and specifically on the most challenging problems of exascale computing. In the past, he worked on VLIW processors, compilers, and energy-efficient servers. He is an IEEE Fellow for contributions to embedded processor architecture and SoC technology. He has co-authored over 100 publications, 42 patents, and a book. He received his Ph.D. in EE from the University of Genoa, Italy, in 1993.
Infrastructure for Edge-to-Cloud Machine Learning (and vice versa)
The edge-to-cloud infrastructure and machine learning (ML) are becoming increasingly coupled in the real world. In the last decade, large-scale voice and image applications have driven phenomenal breakthroughs in ML algorithms that we all use in our daily interactions with internet service providers. More recently, different industry verticals have started to rapidly adopt similar AI approaches, but their needs extend well beyond standard datacenter applications. These new uses of ML involve complex and noisy multi-sensor data, sparsely labeled ground truth, and complex deployment environments that span from the edge to the cloud. This talk covers some aspects of the intricate relationship between ML and the edge-to-cloud world. This relationship goes both ways, and the talk presents two illustrative examples. In one direction, we need new infrastructure for optimized ML. So, the first example discusses “infrastructure for AI/ML”: an architecture blueprint for ML in a Highly Autonomous Driving (HAD) application that spans from instrumented test cars (at the edge) to the training core (in the cloud). In the other direction, ML helps to optimize the infrastructure. The second example discusses “AI/ML for infrastructure”: the use of ML for operational intelligence (AI-Ops) to automate monitoring and predictive maintenance in a high-performance datacenter. AI-Ops takes a holistic view that includes both IT telemetry and facility sensors at the edge of the datacenter.
Princeton University, Dean of Engineering and Applied Science
Andrea Goldsmith is the Stephen Harris professor in the School of Engineering and a professor of Electrical Engineering at Stanford University. Her research interests are in information theory, communication theory, and signal processing, and their application to wireless communications, interconnected systems, and neuroscience. She founded and served as Chief Technical Officer of Plume WiFi (formerly Accelera, Inc.) and of Quantenna (QTNA), Inc., and she currently serves on the Board of Directors for Medtronic (MDT) and Crown Castle Inc. (CCI). Dr. Goldsmith is a member of the National Academy of Engineering and the American Academy of Arts and Sciences, a Fellow of the IEEE and of Stanford, and has received several awards for her work, including the IEEE Sumner Technical Field Award, the ACM Athena Lecturer Award, the ComSoc Armstrong Technical Achievement Award, the WICE Mentoring Award, and the Silicon Valley/San Jose Business Journal’s Women of Influence Award. She is author of the book “Wireless Communications” and co-author of the books “MIMO Wireless Communications” and “Principles of Cognitive Radio,” all published by Cambridge University Press, as well as an inventor on 29 patents. She received the B.S., M.S. and Ph.D. degrees in Electrical Engineering from U.C. Berkeley.
Dr. Goldsmith is currently the founding Chair of the IEEE Board of Directors Committee on Diversity, Inclusion, and Ethics. She served as President of the IEEE Information Theory Society in 2009 and as founding Chair of its student committee. She has also served on the Board of Governors for both the IEEE Information Theory and Communications Societies. At Stanford she has served as Chair of Stanford’s Faculty Senate and for multiple terms as a Senator, and on its Academic Council Advisory Board, Budget Group, Committee on Research, Planning and Policy Board, Commissions on Graduate and on Undergraduate Education, Faculty Women’s Forum Steering Committee, and Task Force on Women and Leadership.
Diversity & Inclusion in Engineering: It's About Success
The engineering profession cannot reach its maximum potential without embracing the diversity of ideas and experiences that come from people of different backgrounds, and without creating an inclusive culture where all people can thrive. This talk discusses why diversity is important in engineering, provides a snapshot of diversity metrics in the profession, and highlights how individuals, universities, companies, and the IEEE are working to improve diversity and inclusion in engineering.
Stanford University, Associate Professor of Computer Science
Silvio Savarese is an Associate Professor of Computer Science at Stanford University and has been Chief Scientist at AiBee Inc. since 2018. He earned his Ph.D. in Electrical Engineering from the California Institute of Technology in 2005 and was a Beckman Institute Fellow at the University of Illinois at Urbana-Champaign from 2005–2008. He joined Stanford in 2013 after serving as Assistant and then Associate Professor of Electrical and Computer Engineering at the University of Michigan, Ann Arbor, from 2008 to 2013. From 2016 to 2018, he served as a director of the SAIL-Toyota Center for AI Research at Stanford. His research interests include computer vision, robotic perception, and machine learning. He is the recipient of several awards, including Best Paper Awards at ICRA 2019 and CVPR 2018, a Best Student Paper Award at CVPR 2016, the James R. Croes Medal in 2013, a TRW Automotive Endowed Research Award in 2012, an NSF CAREER Award in 2011, and a Google Research Award in 2010. In 2002 he was awarded the Walker von Brimer Award for outstanding research initiative.
Towards the AI-Driven Revolution: Benefits and Risks
We are at the beginning of a new technological transformation called the AI revolution. It is characterized by speed (driven by exponentially faster computing and communication, e.g. 5G), scale (driven by globalization), big data (driven by unprecedented storage capabilities and distributed compute) and, critically, by platforms that go from being purely digital to being intimately connected to the physical world (the Internet of Things, robotics, etc.). These ingredients are creating the perfect conditions for a new generation of AI-based computing tools that generate predictions and guide humans in making critical decisions. Unlike other technologies, AI is a global phenomenon with a low barrier to entry but profound transformative effects on entire industries such as transportation, manufacturing, agriculture, construction, retail, and healthcare, to cite a few. We argue that despite these glowing advances, AI-driven technology is still far from achieving the accuracy and reliability needed for many critical applications; more research and advancement are needed to enable machines that perform on par with humans on many high-level tasks. In this talk I will examine recent progress in robotics and machine vision toward overcoming some of these limitations and opening the door to a new generation of AI systems. I will conclude with a brief overview of the benefits and risks of the AI revolution and the opportunity to foster a dialogue on possible recommendations for responsible leadership on this matter.
Airbnb, Technical Fellow
Raymie Stata is a Technical Fellow at Airbnb, where he focuses on issues related to data management. Prior to Airbnb he was founder and CEO of Altiscale, which offered Spark and Hadoop as a managed service in the cloud. Raymie founded Altiscale after leaving Yahoo!, where he was Chief Technical Officer. At Yahoo!, he played an instrumental role in algorithmic search, display advertising, and cloud computing. He also helped set Yahoo’s Open Source strategy and initiated its participation in the Apache Hadoop project. Prior to joining Yahoo!, Raymie founded Stata Laboratories, maker of the Bloomba search-based e-mail client, which Yahoo! acquired in 2004. He has also worked for Digital Equipment’s Systems Research Center, where he contributed to the AltaVista search engine. Raymie received his PhD in Computer Science from MIT in 1996.
Agility with Stability
A fundamental challenge in running software services is managing the conflict between agility and stability. On the one hand, market forces create unending pressure to add features fast. On the other, the faster you try to run, the more likely you are to break something. In this talk, we contend that the "trade-off curve" between agility and stability is fixed for a given organization. We also describe the forces that cause this curve to degrade over time - including the underappreciated challenges of increasing the size of the team. We conclude by discussing investments that can improve the shape of the curve for your organization.
Micron Technology, Inc., Vice President
With a passion for a great technology story and a love for all things nerdy, Martina helps demystify the tech behind the hype by talking with experts and business leaders driving the trends. Martina is an award-winning product marketing and communications executive with nearly 20 years of R&D, enterprise, start-up, and higher education experience. She has led global teams at Micron and Hewlett Packard Enterprise, responsible for telling an amazing innovation story to customers, partners, employees, media and the world.
She has also hosted Micron’s “Pulling Together” and “Heart of Micron” video/podcast series and was the host and executive producer of The Element podcast while at HPE. Martina was a founding member of a privately-held start-up in Munich, Germany – Symplon AG – specializing in early Tablet PC hardware and mobile computing solutions and consulting.
Martina earned a bachelor’s degree in economics from the Wharton School at the University of Pennsylvania and a master’s degree in digital business management from HEC Paris and Télécom ParisTech. She is author of over 20 peer-reviewed articles and papers and has been a frequent speaker on emerging technology trends such as AI, blockchain, cloud, Computing Beyond Moore’s law, open innovation and economic development. She served as Patronage Chair for the IEEE International Conference on Rebooting Computing (ICRC 2019), and has previously served as Chair of the Advisory Board of the NSF-funded Caribbean Computing Center for Excellence.
Trust During Turbulent Times: How to Use Strategic Communications and Radical Transparency to Engage Your Employees and Transform Your Organization
During challenging times, leaders must step up to communicate with clarity, transparency and credibility. Today, companies have a unique opportunity to act as a trusted, expert voice to their employees, customers, partners and communities. Join this talk to hear more about lessons learned from the front lines of communications during COVID-19 and how you can put these marketing and communication strategies to work to drive engagement and trust with your team members and colleagues, and ultimately to deliver better business and technical outcomes.
Salesforce, Software Architect
Manoj Agarwal is a Software Architect in the Einstein Platform team at Salesforce. He has almost 25 years of experience in the industry, building distributed systems, public cloud services and machine learning platforms.
Serving Very Large Numbers of Low Latency AutoML Models
ML serving infrastructure is becoming ubiquitous in the emerging ML industry as well as in public cloud offerings. Existing solutions overwhelmingly rely on serving models as containers, where one container hosts a single model with all its required dependencies. Salesforce's Einstein Platform takes a unique multi-tenancy approach that relies heavily on AutoML, automating feature engineering, training, and serving of a separate model per tenant, ultimately resulting in serving up to hundreds of thousands of models. In addition, model size, initialization time, and popularity/volume can vary widely based on the underlying customer base of each tenant, introducing what we call the model balancing problem. We present our approach to scaling to a very large number of models using multi-level routing and load balancing, sharing hundreds of models within each container, and using sophisticated metric-driven mechanisms for model initialization, warmup, and model balancing. In addition, we'll present our solution for managing model versions and dependencies in shared-container scenarios and, finally, lessons learned on our journey in this nascent space.
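The routing-and-balancing idea described above can be illustrated with a toy sketch. This is not Salesforce's actual implementation; the class names, tenant IDs, and model IDs are all hypothetical. It shows a fixed pool of containers, each holding an LRU-bounded set of models, with a router hashing each tenant's model to a container shard:

```python
from collections import OrderedDict

class ModelContainer:
    """Hosts up to `capacity` models; evicts the least recently used."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.models = OrderedDict()  # model_id -> loaded model (stub)

    def predict(self, model_id, features):
        if model_id not in self.models:
            if len(self.models) >= self.capacity:
                self.models.popitem(last=False)   # evict the coldest model
            self.models[model_id] = f"loaded:{model_id}"  # stand-in for init/warmup
        self.models.move_to_end(model_id)          # mark as recently used
        return {"model": model_id, "score": 0.5}   # dummy score

class Router:
    """First routing level: map a tenant's model to a container shard."""
    def __init__(self, n_containers, capacity_per_container):
        self.containers = [ModelContainer(capacity_per_container)
                           for _ in range(n_containers)]

    def predict(self, tenant_id, model_id, features):
        shard = hash((tenant_id, model_id)) % len(self.containers)
        return self.containers[shard].predict(model_id, features)

router = Router(n_containers=4, capacity_per_container=100)
result = router.predict("tenant-42", "churn-model-v3", {"f1": 1.0})
print(result["model"])  # churn-model-v3
```

A real system would layer metric-driven placement on top of the hash (moving hot or heavy models between shards), which is where the model balancing problem the abstract names comes in.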
Rockset, CTO and Co-Founder
Dhruba Borthakur is cofounder and CTO at Rockset (http://rockset.com), a company building a realtime cloud database for data-powered applications. Previously, Dhruba was the founding engineer of the open source RocksDB database at Facebook and one of the founding engineers of the Hadoop file system at Yahoo; an early contributor to the open source Apache HBase project; a senior engineer at Veritas, where he was responsible for the development of VxFS and Veritas SanPointDirect storage system; the cofounder of Oreceipt.com, an ecommerce startup based in Sunnyvale; and a senior engineer at IBM-Transarc Labs, where he contributed to the development of Andrew File System (AFS). Dhruba did his graduate studies at the University of Wisconsin, Madison.
The Changing Face of Data Analytics from Batch Analytics to Analytics-on-the-fly
Data analytics started with Hadoop in 2005, which made it possible to mine large data sets and extract intelligence from them. These were batch jobs that could run for hours. A natural evolution followed around 2010, when Apache Spark and Kafka enabled stream processing of data at scale. The stream-processing movement reduced the time from data to insights to tens of minutes. But some industries demanded lower query latencies along with lower data latencies; for example, the Facebook News Feed and the LinkedIn FollowFeed applications required super-low data latency. It is 2018: enter Analytics-on-the-fly!
In this talk, Dhruba walks you through the history and evolution of the original Hadoop architecture that separates compute from storage. He inspects the Lambda Architecture and the reasons it is popular for stream-processing systems. He then describes the Aggregator-Leaf-Tailer (ALT) architecture, which provides Analytics-on-the-fly by allowing fast queries on semi-structured data. He peels apart the disaggregated, cloud-friendly nature of ALT, which lets one scale compute, storage, data rates, and query volumes independently. He describes how replacing the MapReduce framework of earlier-generation Hadoop with a RocksDB-based indexing framework in the ALT architecture reduces query latency. He talks about the CQRS pattern in this architecture, which isolates query latencies from bursty streams. He elaborates on why these new applications demand higher query concurrency and how the ALT architecture provides it. The talk concludes with a description of how the ALT architecture is already in production in a set of applications at Facebook, LinkedIn, and Rockset.
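The separation the ALT/CQRS pattern draws between the write path (tailers building an index from a stream) and the read path (leaves answering queries from that index) can be sketched in a few lines. This is an in-memory toy, not Rockset's implementation; the field names and IDs are made up:

```python
class Tailer:
    """Write path (CQRS command side): tails an event stream and updates the index."""
    def __init__(self, index):
        self.index = index

    def ingest(self, event):
        # Index every field of the semi-structured document for fast lookups.
        doc_id = event["id"]
        for field, value in event.items():
            self.index.setdefault((field, value), set()).add(doc_id)

class Leaf:
    """Read path (CQRS query side): serves low-latency queries from the index."""
    def __init__(self, index):
        self.index = index

    def query(self, field, value):
        return sorted(self.index.get((field, value), set()))

index = {}                       # shared storage layer (disaggregated in practice)
tailer, leaf = Tailer(index), Leaf(index)
tailer.ingest({"id": "u1", "country": "US", "plan": "pro"})
tailer.ingest({"id": "u2", "country": "US", "plan": "free"})
print(leaf.query("country", "US"))  # ['u1', 'u2']
```

Because tailers and leaves are separate components sharing only the index, each side can be scaled independently, which is the property the talk attributes to the disaggregated ALT design.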
Shutterstock, Software Development Manager
Anusha Dayananda is a Software Development Manager at Shutterstock. In this role, she is responsible for the Content Pipeline and the Localization/Internationalization Engineering teams. These teams support Shutterstock’s core content ingestion to publishing lifecycle at scale, and support the site’s offering in 21 languages. Anusha leads a team of engineers to successfully solve complex and challenging problems for Shutterstock’s 1 million Contributors and 1.9 million Customers.
Anusha has over 10 years of experience in Software Engineering across different domains. She previously spent more than seven years working on a software development team in the healthcare technology industry, where she worked to simplify healthcare data collection and reporting for hospitals and government organizations across the country. Anusha holds an MS in Computer Science from New Jersey Institute of Technology.
How Shutterstock Enhanced Stability and Reliability for Message-Driven Applications
With a growing library of over 310 million images and 17 million videos, Shutterstock needs to quickly and efficiently ingest, review, and publish over 200,000 assets per day to enable its customers around the world to deliver impactful stories. Using its own innovative practices, Shutterstock developed robust internal libraries that use RabbitMQ message brokers to support multiple applications. Shutterstock's Software Development Manager, Anusha Dayananda, will share her observations and lessons learned on ensuring stability and reliability for distributed systems. The talk will cover best practices for eliminating central control so that engineering teams can easily adopt and operate the messaging infrastructure, including:
- Best practices for eliminating risks against cluster failures
- How to reduce cross-team dependency
- Ways to optimize engineering workflows for time savings and efficiency
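One practice that underpins reliability work like the items above is making consumers idempotent, since a broker such as RabbitMQ guarantees at-least-once delivery and can redeliver a message after a broker or consumer failure. A minimal, broker-free sketch of the idea (the handler, message IDs, and bodies are hypothetical, not Shutterstock's code):

```python
class IdempotentConsumer:
    """Deduplicates redelivered messages by message ID before handling them."""
    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # in production: a persistent store, ideally with a TTL

    def on_message(self, message):
        msg_id = message["id"]
        if msg_id in self.seen:
            return "skipped"            # redelivery: already processed
        self.handler(message["body"])
        self.seen.add(msg_id)           # mark done only after the handler succeeds
        return "processed"

published = []
consumer = IdempotentConsumer(handler=published.append)
consumer.on_message({"id": "m1", "body": "ingest asset 123"})
consumer.on_message({"id": "m1", "body": "ingest asset 123"})  # broker redelivers
print(published)  # ['ingest asset 123'] -- handled exactly once
```

With this pattern, a cluster failover that causes redelivery does not cause duplicate side effects, which is one ingredient of the cluster-failure resilience the first bullet refers to.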
Google, Software Engineer
Nova is a software engineer working on Federated Learning platform infrastructure at Google, building services that carry out privacy preserving machine learning and analytics algorithms at scale. Prior to working on Federated Learning she worked on the Google Cloud BigQuery data warehouse. She graduated from the University of Pennsylvania in 2017 with a B.S.E. and M.S.E. in Computer Science. At Penn she was a member of the IEEE Eta Kappa Nu honor society.
Privacy-Preserving Machine Learning
Federated Learning enables mobile devices to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud. Topics to be presented include: (1) how federated learning differs from more traditional machine learning paradigms; (2) practical algorithms for federated learning that address the unique challenges of this setting; (3) extensions to federated learning, including differential privacy, secure aggregation, and compression for model updates; and (4) a quick overview of federated learning applications and systems at Google.
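At the heart of topic (2) is usually Federated Averaging (FedAvg): each device trains locally on its own data, and the server averages the resulting model weights, weighted by each client's number of examples. A minimal pure-Python sketch of the server-side averaging step (the client counts and weights below are toy numbers, not a real model):

```python
def federated_average(client_updates):
    """Weighted average of client model weights (the FedAvg server step).

    client_updates: list of (num_examples, weights) pairs, where `weights`
    is a flat list of model parameters trained on-device.
    """
    total_examples = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    averaged = [0.0] * dim
    for n, weights in client_updates:
        for i, w in enumerate(weights):
            averaged[i] += (n / total_examples) * w
    return averaged

# Two devices: one trained on 100 examples, the other on 300.
updates = [(100, [1.0, 0.0]), (300, [0.0, 1.0])]
print(federated_average(updates))  # [0.25, 0.75]
```

The extensions listed in topic (3) wrap exactly this step: secure aggregation lets the server compute the sum without seeing individual updates, and differential privacy adds calibrated noise to it.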
Walmart Labs, Principal Machine Learning Engineer
As part of the first merchant technology data science team at Walmart Labs, Hamza has worked to help create a data-informed culture at the company. During his tenure, he led the first-ever team to make assortment, pricing, and replenishment recommendations in stores, eventually owning the P&L for them. Currently, he leads multiple data science teams responsible for building machine learning and optimization algorithms to help merchants make better decisions.
Going Beyond Big Data: Taking ML to the Next Level
All retailers want to know their target buyers better. However, understanding the past and present of those interactions simply isn't enough these days, and predictive analytics is the next step to understanding customers better. In this session, topics of discussion will include, but not be limited to:
- How ML can enable price optimization, product placement and assortment selection
- Using machine learning algorithms effectively to generate suggestions for substitute and complementary items
- Using deep learning and reinforcement learning to improve order forecasting
- Utilizing optimization algorithms to reduce store costs by optimizing replenishment cycle and safety stock
- Scaling algorithms to generate recommendations for individual stores and to monitor their performance
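The replenishment-and-safety-stock bullet above typically comes down to a classic inventory formula: safety stock absorbs demand variability over the resupply lead time, and the reorder point adds the expected lead-time demand. A sketch with made-up numbers (the z-score, demand, and lead-time figures are illustrative, not Walmart's):

```python
import math

def reorder_point(mean_daily_demand, demand_stddev, lead_time_days, z):
    """Reorder point = expected lead-time demand + safety stock.

    Safety stock = z * sigma_d * sqrt(lead_time), assuming independent
    daily demand; z encodes the target service level (1.65 ~ 95%).
    """
    safety_stock = z * demand_stddev * math.sqrt(lead_time_days)
    return mean_daily_demand * lead_time_days + safety_stock

# A store sells 50 units/day on average (stddev 10); resupply takes 4 days.
rop = reorder_point(mean_daily_demand=50, demand_stddev=10, lead_time_days=4, z=1.65)
print(round(rop))  # 233
```

ML enters by replacing the fixed mean and standard deviation with per-store, per-item demand forecasts, and optimization then tunes z against the cost of stockouts versus holding inventory.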
OpsCruise, Co-Founder
Aloke Guha is a serial entrepreneur with an extensive career in data center technologies including storage and networking, big data, and machine learning and analytics predating the AI winters. Before co-founding OpsCruise, he was Vice President of Analytics and Big Data products at Hitachi Data Systems. His earlier startups include Copan Systems, a pioneer in MAID data storage (acquired by HPE), and Datavail/CreekPath (acquired by HP), as well as a number of machine learning startups in areas of real-time intent detection and contextual search and analytics. Previously, he was VP and Chief Architect at StorageTek (acquired by Oracle) and CTO of Network Systems (acquired by StorageTek).
Aloke is an IEEE Senior Member and has authored over 60 patents (26 issued) and over 60 technical publications.
He holds a B. Tech (EE) from the Indian Institute of Technology (Kanpur), and an MSEE and Ph.D. from the University of Minnesota.
Model Based Control for Microservices Applications
The move to the cloud to leverage agility and scale initiated a fundamental shift from monolithic to microservices architectures. While this shift improved the agility of application development (Dev) teams, it created significant challenges for operations (Ops) teams. These challenges result from variability in application structure and behavior, shifting bottlenecks, and limited visibility and control of cloud infrastructure resources and services. Traditional performance management approaches, such as queuing networks or heuristics that capture static relationships between performance and resources, are no longer applicable in highly dynamic virtual cloud environments.
We have designed and implemented a new model-based control platform for managing microservices application performance. Relying on existing monitoring, event, and configuration data, especially data collected from open CNCF projects such as Kubernetes and Prometheus, and without instrumenting application code, we automatically discover and build the application structure. Subsequently, using both machine learning (ML) and a priori knowledge of known services, we build a predictive behavior model for the complete application. The application model and structure provide sufficient granularity to detect and predict performance problems, isolate the cause of incidents, and recommend remedial actions to the infrastructure and services for problem resolution.
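A predictive behavior model in this spirit can be as simple as tracking each service metric's running statistics and flagging deviations before they become incidents. Below is a toy exponentially weighted (EWMA) z-score detector, a generic illustration and not OpsCruise's actual model; the latency numbers are invented:

```python
class MetricModel:
    """Learns a metric's running mean/variance (EWMA) and flags outliers."""
    def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
        self.alpha, self.threshold, self.warmup = alpha, threshold, warmup
        self.mean, self.var, self.n = 0.0, 0.0, 0

    def observe(self, value):
        """Returns True if `value` deviates beyond `threshold` sigmas."""
        self.n += 1
        if self.n == 1:                    # first sample initializes the model
            self.mean = value
            return False
        diff = value - self.mean
        anomalous = (self.n > self.warmup and self.var > 0
                     and abs(diff) > self.threshold * self.var ** 0.5)
        # Update the running statistics (exponentially weighted).
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return anomalous

model = MetricModel()
latencies = [100, 101, 100, 99, 100, 101, 100, 99, 400]  # ms; last one spikes
flags = [model.observe(x) for x in latencies]
print(flags[-1])  # True -- the 400 ms spike is flagged
```

A platform like the one described would run a model per metric per service, then correlate the resulting anomaly signals against the discovered application structure to isolate the cause.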
Intel, Fellow and Chief Technology Officer of Information Technology
Shesha Krishnapura is an Intel Fellow and chief technology officer in the Information Technology organization at Intel Corporation. He is responsible for advancing Intel data centers for energy and rack space efficiency, disaggregated server innovation and hardware designs, high-performance computing (HPC) for electronic design automation (EDA), and optimized platforms for enterprise computing. He is also responsible for fostering unified technical governance across IT, leading consolidated IT strategic research and pathfinding efforts, and advancing the talent pool within the IT technical community to help shape the future of Intel.
A three-time recipient of the Intel Achievement Award, Shesha was appointed an Intel Fellow in 2016. His external honors include an InformationWeek Elite 100 award, an InfoWorld Green 15 award and recognition by the U.S. Department of Energy for industry leadership in energy efficiency. He has been granted several patents and has published more than 75 technical articles.
Shesha holds a bachelor’s degree in electronics and communications engineering from Bangalore, India, and a master’s degree in computer science from Oregon State University. He is the founding chair of the EDA computing board of advisers for platform standards. He has represented Intel as a voting member of the Open Compute Project incubation committee.
At Scale Green Computing Innovations for Datacenter Transformation
Datacenters are the backbone of internet commerce and cloud storage, and they are also hugely important in supporting the computational and transactional needs of today's corporate world. They currently consume 3% of the world's electrical supply, and the world produces upwards of 50 million metric tons of e-waste annually. Thus, Total Cost to the Environment (TCE) is as important as Total Cost of Ownership (TCO) for datacenters.
Intel has a huge datacenter investment with over 290,000 servers (which include over 2 million Xeon high clock cores), over 384 petabytes of storage, and more than 545,000 network ports within its 86-Megawatt data center capacity. This infrastructure is required to support efforts including complex chip design, and Intel’s overall computing needs have grown in excess of 6000% over the past 13 years.
This talk will discuss the complexities of building and maintaining these datacenters, as well as how green computing initiatives have delivered substantial reductions in power costs while significantly reducing greenhouse gas emissions. Being green is not just about running datacenters at the most efficient Power Usage Effectiveness (PUE) levels, but also about reducing e-waste.
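For reference, PUE is simply the ratio of total facility energy to the energy delivered to IT equipment, so a quick worked example (the numbers are illustrative, not Intel's) shows why driving it toward 1.0 matters:

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: 1.0 means all power reaches IT gear."""
    return total_facility_kw / it_equipment_kw

# Illustrative: a 10 MW IT load in a typical facility vs. an efficient one.
typical = pue(total_facility_kw=18_000, it_equipment_kw=10_000)    # 1.8
efficient = pue(total_facility_kw=10_600, it_equipment_kw=10_000)  # 1.06
overhead_saved_kw = (typical - efficient) * 10_000
print(typical, efficient, round(overhead_saved_kw))  # 1.8 1.06 7400
```

At the scale of an 86-megawatt fleet, shaving that kind of cooling and power-distribution overhead is where the bulk of the power-cost and emissions savings come from.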
Datakin, CTO and Co-founder
Julien Le Dem
Julien Le Dem is the CTO and Co-Founder of Datakin. He co-created Apache Parquet and is involved in several open source projects including Marquez (LF AI), Apache Pig, Apache Arrow, Apache Iceberg, and a few others. Previously, he was a senior principal at WeWork; principal architect at Dremio; tech lead for Twitter’s data processing tools, where he also obtained a two-character Twitter handle (@J_); and a principal engineer and tech lead working on content platforms at Yahoo, where he received his Hadoop initiation. His French accent makes his talks particularly attractive.
Data Platform Architecture Principles
We’re well into the Big Data era. Most organizations have embraced collecting data and analyzing what’s happening inside their products. This is crucial to their success, not only to understand what works but also to optimize their services and increase their value to their customers. Several industries are being disrupted just by using technology to optimize existing processes. Think about cabs, short-term rentals, or co-working spaces.
There’s also discussion around what constitutes “big” data. Here, we’re not only talking about the large volumes of data produced by the likes of Google, Facebook, and other very large companies. We’re also talking about the multitude of data sources and the many teams using them and producing derived datasets. The concept of a central data team that does all the data-related work is outdated; the entire organization should become an ecosystem where teams depend on each other. Central data teams now become enablers, coaching and providing a safe and flexible environment to move fast while bringing transparency to the increasing complexity of interdependent systems. Data processing and microservices have similar requirements in terms of ownership, monitoring, and dependency management.
In this talk we will discuss the principles to follow while building the data platform enabling the entire organization to build data driven products, whether using insights from the data or using data directly to build features (for example recommendations).
Every team can consume and produce data using explicit contracts: what they share (or don’t), the level of service they provide, and the quality of the data. We need to build visibility across the entire org and help evolve the dependency graph with global lineage and schema evolution.
The platform is self-service and gets out of the way to empower users to do the right thing. It provides a safe environment where mistakes can be easily mitigated and the scope of their impact limited. It is flexible enough to let users pick the best tool for the job while facilitating interdependencies. Streaming and batch processing are complementary and work together. Governance is delegated to the appropriate stewards. Sensitive data is properly annotated and secured, and its usage tracked and controlled. With the cloud omnipresent, users expect not to have to worry about where processes run or where data is stored. The platform is expected to scale transparently and be billed by the minute.
We’ll discuss the best tools for building the data platform and how to build the missing pieces.
Netflix, Senior Software Engineer
Joseph Lynch is a Senior Software Engineer at Netflix who focuses on building high volume datastore infrastructure for providing low latency data access. He is a core contributor to Netflix’s realtime datastore platform, which supports their always-available polyglot persistence tier including Cassandra, Elasticsearch, CockroachDB, Zookeeper, and more. He loves building distributed systems and learning the fun and exciting ways that they scale, operate, and break. Prior to Netflix, he helped Yelp scale distributed databases and was a key engineer in the design and implementation of their service mesh architecture. Joseph graduated from the Massachusetts Institute of Technology in 2013 with a SB in Electrical Engineering and Computer Science.
Towards Practical Self-Healing Distributed Databases
As distributed databases expand in popularity, there is ever-growing research into new database architectures that are designed from the start with built-in self-tuning and self-healing features. In real-world deployments, however, migration to these entirely new systems is impractical, and the challenge is to keep massive fleets of existing databases available under constant software and hardware change. Apache Cassandra is one such existing database that helped to popularize “scale-out” distributed databases, and it runs some of the largest existing deployments of any open-source distributed database. In this paper, we demonstrate the techniques needed to transform the typical, highly manual Apache Cassandra deployment into a self-healing system. We start by composing specialized agents together to surface the signals needed for a self-healing deployment and to execute local actions. Then we show how to combine the signals from the agents into the cluster-level control planes required to safely iterate and evolve existing deployments without compromising database availability. Finally, we show how to create simulated models of the database’s behavior, allowing rapid iteration with minimal risk. With these systems in place, it is possible to create a truly self-healing database system within existing large-scale Apache Cassandra deployments.
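As an illustration of the agent/control-plane split described above — a minimal sketch with invented signal and action names, not the paper’s actual design — on-host agents surface simple health signals, and a cluster-level control plane turns them into a bounded set of safe actions so the database stays available while the fleet converges.

```python
from dataclasses import dataclass

# Hypothetical signal an on-host agent might surface for one node.
@dataclass
class NodeSignal:
    node: str
    disk_pct: float       # disk utilization, 0.0 - 1.0
    gossip_healthy: bool  # node participating in cluster gossip

def plan_actions(signals, max_concurrent=1):
    """Cluster-level control plane: turn raw agent signals into actions,
    remediating at most `max_concurrent` nodes at a time so availability
    is never compromised by the control plane itself."""
    actions = []
    for s in signals:
        if not s.gossip_healthy:
            actions.append((s.node, "restart"))
        elif s.disk_pct > 0.9:
            actions.append((s.node, "compact"))
        if len(actions) >= max_concurrent:
            break  # respect the availability budget; revisit next cycle
    return actions
```

The key design point is that agents only observe and act locally, while the decision of *how many* nodes to touch at once lives in one place.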
Hewlett Packard Enterprise, NFV R&D
Chengappa M. R.
Chengappa has been with HP/HPE for a decade and is currently part of the Compute Business Group, catering to the requirements and solutions for telcos on NFV infrastructure. From his early career as an intern software engineer, his focus areas have moved him into defining, building, and delivering NFV-based telco solutions, bringing together technologies that integrate cloud-native applications/VNFs on NFVi platforms. As an IEEE Senior Member, he also plays a key role in IEEE Bangalore Section initiatives around industry engagement and professional activities.
Open Distributed Infrastructure Management - ODIM
The evolution of the telecom network to 5G and edge computing requires IT compute, storage, and networking infrastructure from multiple vendors to be deployed across potentially thousands of geographically distributed and diverse points of presence, including central offices, cell tower huts, and wiring closets, as well as traditional data centers. This sharply contrasts with the comparatively few homogeneous hyper-scale data centers of the cloud service providers. Telecom operators are demanding open, scalable infrastructure management for this distributed multi-vendor infrastructure, based on widely adopted industry standards. In this paper, we address these unique challenges through an open-source, Distributed Management Task Force (DMTF) Redfish®-based Resource Aggregation function; building a vibrant open-source infrastructure management community around the Resource Aggregator will enable the construction of next-generation telecommunications and other highly distributed networks.
Netflix, Senior Software Engineer
Cody Rioux is a Software Engineer designing and developing real-time observability platforms at Netflix. His work ranges from outlier/anomaly detection to control theory based autoscalers to large scale distributed systems. He is currently a core committer to Mantis, an open source real-time observability platform which plays a critical role in keeping the Netflix cloud environment reliable. His work has been featured on the Netflix Tech blog, as well as at several meetups, Strata+Hadoop World and PyData.
Mantis Query Language: On-Demand Real-Time Intro
Mantis, together with the Mantis Query Language (MQL), is an open-source platform which provides Netflix service owners with infrastructure and a DSL for instantly accessing real-time insights across the entire ecosystem. Currently Mantis moves over 2 trillion events per day, encompassing several petabytes of data. We'll explore the architecture of Mantis and how it enables low-cost real-time data, as well as the design of the query language, which provides an expressive abstraction over all of this machinery. Ultimately we'll see how Mantis enables use cases from outlier/anomaly detection through A/B testing and much more for a reliable cloud experience, and how you can take advantage of the open source version to observe and monitor your own cloud environment.
Mentor, a Siemens Business
Christopher Wolff earned his Ph.D. in Electrical and Computer Engineering from Carnegie Mellon University. He has worked at many IC and EDA companies and currently leads an infrastructure team for the Catapult product. He is a Senior IEEE member.
Shared File Updating for Test Farms Using a File Cache
When a file needs to be updated by thousands of tests running on a compute farm, waiting for a chance to update slows down the overall runtime as the number of parallel test runs increases. We implemented a file cache that allows each test to write a small temporary file that either updates the shared file or is stored for later updating. A network file lock controls access to the shared file. If the file lock is not acquired, temporary files are merged to reduce the number of future file reads. Temporary files are removed only once their contents are successfully added to the shared file or merged into another temporary file. Each test needs to update the shared file many times during a run, so a version number is used to select the temporary file to read and discard the others. The file cache eliminated the time delay for updating the shared file, saving hours of runtime per test suite and speeding up developer check-in time with no penalty for increasing tests or compute nodes. This paper describes how the file cache was developed, details the current approach, and illustrates the performance benefits.
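The merge protocol described above can be sketched in a few lines. This is a hypothetical reconstruction, not the authors' implementation: an atomic file create stands in for the network file lock, pending temporary files are merged into the shared file only when the lock is acquired, and a temporary file is removed only after its contents have been merged. (The version-number selection from the paper is not modeled here.)

```python
import glob
import json
import os
import uuid

LOCK = "shared.lock"    # stand-in for the network file lock
SHARED = "shared.json"  # hypothetical shared results file
TMPDIR = "pending"      # directory holding deferred updates

def try_lock():
    # O_CREAT | O_EXCL makes lock-file creation atomic: exactly one
    # test at a time can succeed, mimicking a network file lock.
    try:
        os.close(os.open(LOCK, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
        return True
    except FileExistsError:
        return False

def record_update(update):
    """Each test writes its update to a small temporary file, then
    merges all pending updates only if it wins the lock."""
    os.makedirs(TMPDIR, exist_ok=True)
    tmp = os.path.join(TMPDIR, f"{uuid.uuid4().hex}.json")
    with open(tmp, "w") as f:
        json.dump(update, f)
    if try_lock():
        try:
            flush_pending()
        finally:
            os.remove(LOCK)
    # else: the temp file simply waits for a later lock holder

def flush_pending():
    merged = {}
    if os.path.exists(SHARED):
        with open(SHARED) as f:
            merged = json.load(f)
    for path in glob.glob(os.path.join(TMPDIR, "*.json")):
        with open(path) as f:
            merged.update(json.load(f))
        os.remove(path)  # removed only after its contents are merged
    with open(SHARED, "w") as f:
        json.dump(merged, f)
```

Because a test that fails to acquire the lock still persists its update as a temporary file, no test ever blocks waiting for the shared file, which is the source of the runtime savings claimed above.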
Twitter, Sr. Staff Software Engineer
Lohit is part of the Data Platform team at Twitter. He concentrates on projects around storage, compute, and log pipelines at Twitter scale, both on premises and in the cloud. He worked at several startups before joining Twitter. He has a Master's degree in Computer Science from Stony Brook University.
Lessons from Scaling HDFS for Exabyte Storage at Twitter
In the past decade the Hadoop Distributed File System (HDFS) has become the de facto storage choice for large-scale analytics. Growing data storage needs at Twitter have forced us to solve multiple scalability, reliability, hardware layout, and performance problems in HDFS. At Twitter we have scaled HDFS storage across multiple clusters by incrementally consuming new features and contributing scalability fixes to the open-source community. In this presentation we lay out our architecture for scaling HDFS to exabyte storage, share our learnings from managing such a large-scale distributed system, and note a few changes planned for the future. Participants will get an introduction to the challenges of managing a large-scale distributed storage system. They will learn about Twitter's data architecture and its journey in scaling an open-source system.
Twitter, Staff Software Engineer
Zhenzhao works at Twitter as part of the Data Platform organization. He currently concentrates on Twitter's log ingestion pipeline, which scales to handle trillions of events per day. Previously he was a member of the DFS (Pangu) team in Alibaba Cloud, where he focused on random-file-access features in Pangu, used as storage for virtual machines.
Scaling Event Aggregation at Twitter to Handle Billions of Events per Minute
Log files consisting of events from different services are a rich source of information for large-scale analytics. Events can be as simple as a log line or as complex as nested structured objects such as Thrift or Protocol Buffers messages. At Twitter every service logs events for a particular category and publishes them to the Event Log Aggregation framework. This framework aggregates events of the same category into log files, usually stored on a distributed file system such as the Hadoop Distributed File System (HDFS). Large-scale, multi-petabyte analytics use these files across hundreds of projects. In this paper we provide an overview of the Event Aggregation framework used at Twitter, highlight its advantages, and compare it with similar frameworks. We also introduce the concepts of category group and aggregator group in our architecture. Services at Twitter generate trillions of events with an aggregate size exceeding multiple petabytes of data every day. At present this framework handles over three billion events per minute. The main focus of our efforts has been efficient use of hardware resources and the scalability and reliability of the framework.
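One way to picture the category/aggregator-group split mentioned above — a loose sketch only, since the abstract does not describe the framework's actual partitioning scheme — is that each category is deterministically assigned to an aggregator group, and events are bucketed first by group, then by category, so the same group always owns a category's log files.

```python
import zlib
from collections import defaultdict

NUM_AGGREGATOR_GROUPS = 4  # hypothetical; real sizing would differ

def aggregator_group(category):
    # Deterministic assignment: the same category always lands in the
    # same aggregator group, keeping file ownership stable.
    return zlib.crc32(category.encode()) % NUM_AGGREGATOR_GROUPS

def aggregate(events):
    """Bucket (category, payload) events by aggregator group, then by
    category - the shape an aggregation tier might write out as one
    log file per category on HDFS."""
    groups = defaultdict(lambda: defaultdict(list))
    for category, payload in events:
        groups[aggregator_group(category)][category].append(payload)
    return groups
```

Grouping categories lets the framework scale out by adding groups without reshuffling every category, at the cost of some imbalance between groups.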
Microsoft, Senior Program Manager
Faith Xu is a Senior Program Manager at Microsoft on the Machine Learning Platform team, focusing on frameworks and tools. She leads efforts to support performant and scalable accelerated ML model inferencing across a variety of high-volume products. She is an evangelist for adoption of the open source ONNX standard with community partners to promote an open ecosystem in AI.
Faster Scalable ML Model Deployment Using ONNX and Open Source Tools
As ML developments shift from research to the real world, we encounter many deployment challenges. Teams may be experimenting with various training frameworks, with deployments targeting multiple platforms and hardware. While training using one framework with one hardware target can easily be managed, it becomes challenging with a matrix of multiple frameworks and deployment targets. This fragmented ecosystem introduces deployment complexities, and oftentimes custom code is needed to maximize performance for each scenario, which is time-consuming to maintain as models are updated.
To streamline this, the interoperable ONNX model format and the ONNX Runtime inference engine can be used to deploy models performantly across a variety of hardware. Models trained in PyTorch, TensorFlow, scikit-learn, Core ML, and more can all be converted to the common ONNX format, and the model can then be run with the cross-platform, performance-focused ONNX Runtime inference engine, which supports various hardware acceleration options across CPUs and GPUs.
ONNX Runtime is already used in key Microsoft services, on average realizing 2x performance improvements. In this session, we'll share an overview of ONNX Runtime, success stories and usage examples from high volume product groups at Microsoft, and demonstrate ways to integrate this into your AI workflows for immediate impact.