Auto AI and Machine Learning Systems Design

Neil D. Lawrence

UK Systems Research, Virtual Seminar Series

Supply Chain Optimization

Llew Mason Devesh Mishra

Forecasting

Jenny Freshwater Ping Xu Dean Foster

Inventory and Buying

Deepak Bhatia Piyush Saraogi Raman Iyer Salal Humair Narayan Venkatasubramanyan

The Mythical Man-month

Separation of Concerns

Intellectual Debt

  • Technical debt is the inability to maintain your complex software system.

  • Intellectual debt is the inability to explain your software system.

FIT Models to FIT Systems

  • Focus in machine learning has been on FAcT learning.
  • Fairness, Accountability and Transparency in individual models.
  • But individual models aren’t the problem.
  • Fairness, interpretability and transparency are required for the whole system.

Service Oriented Architecture

Charlie Bell Peter Vosshall

Data Oriented Architectures

  • Convert data to a first-class citizen.
  • View system as operations on data streams.
  • Expose data operations in a programmatic way.
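
A minimal sketch of what exposing data operations programmatically could look like. The `Stream` class and the order data are hypothetical, not taken from any library: each stream records the operation that produced it, so the data flow can be queried as well as run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class Stream:
    """Hypothetical stream handle: the data plus a record of how it was derived."""
    name: str
    source: Callable[[], Iterable]
    parent: Optional["Stream"] = None
    operation: str = "source"

    def map(self, name: str, fn: Callable) -> "Stream":
        # Derive a new stream and remember which function produced it.
        return Stream(name, lambda: (fn(x) for x in self.source()),
                      self, f"map({fn.__name__})")

    def filter(self, name: str, pred: Callable) -> "Stream":
        return Stream(name, lambda: (x for x in self.source() if pred(x)),
                      self, f"filter({pred.__name__})")

    def lineage(self) -> List[str]:
        # Walk back to the source so the derivation can be inspected.
        steps, node = [], self
        while node is not None:
            steps.append(f"{node.name} <- {node.operation}")
            node = node.parent
        return list(reversed(steps))

    def __iter__(self) -> Iterator:
        return iter(self.source())


def add_shipping(order: dict) -> dict:
    return {**order, "price": order["price"] + 5}


def is_large(order: dict) -> bool:
    return order["price"] > 50


orders = Stream("orders", lambda: [{"id": 1, "price": 100}, {"id": 2, "price": 20}])
large = orders.map("orders_shipped", add_shipping).filter("large_orders", is_large)

print(list(large))
print(large.lineage())
```

Because the derivation is itself data, a question such as "where does `large_orders` come from?" is answered by calling `lineage()` rather than by reading source code.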

Data Oriented Architectures

  • Historically we’ve been software first
    • A necessary but not sufficient condition for data first
  • Move from
    1. service oriented architectures, to
    2. data oriented architectures

Streaming System

  • Move from pull updates to push updates.
  • Operate on rows rather than columns.
  • Leads to stateless logic: persistence handled by system.
  • Example: Apache Kafka + Apache Flink
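
The points above can be sketched in a few lines, assuming a hypothetical in-memory `Topic` class standing in for a real stream (this is not the Kafka or Flink API): rows are pushed to subscribers as they arrive, and the processing function keeps no state of its own.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


class Topic:
    """Hypothetical in-memory stand-in for a stream topic."""

    def __init__(self) -> None:
        self.log: List[Row] = []
        self.subscribers: List[Callable[[Row], None]] = []

    def subscribe(self, handler: Callable[[Row], None]) -> None:
        self.subscribers.append(handler)

    def publish(self, row: Row) -> None:
        self.log.append(row)              # persistence is the topic's job
        for handler in self.subscribers:
            handler(row)                  # push: consumers never poll


def add_total(row: Row) -> Row:
    """Stateless logic: the output depends only on the incoming row."""
    return {**row, "total": row["quantity"] * row["unit_price"]}


orders = Topic()
totals = Topic()
orders.subscribe(lambda row: totals.publish(add_total(row)))

orders.publish({"quantity": 2, "unit_price": 3.0})
orders.publish({"quantity": 1, "unit_price": 10.0})
print(totals.log)
```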

Streaming Architectures

  • AWS Kinesis, Apache Kafka
  • Not just about streaming
    • Nodes in the architecture are stateless
    • They persist through storing state on streams
  • This brings the data inside out
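
One way to picture nodes that persist by storing state on streams, as a sketch with a hypothetical stock-level event log: the node holds nothing between runs and rebuilds its view by replaying the stream.

```python
from functools import reduce
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (item, change in stock level); illustrative schema


def apply_event(stock: Dict[str, int], event: Event) -> Dict[str, int]:
    """Pure update: returns a new state, never mutates the old one."""
    item, delta = event
    return {**stock, item: stock.get(item, 0) + delta}


def current_stock(log: List[Event]) -> Dict[str, int]:
    # The node's state is a fold over the stream, so it can always be rebuilt.
    return reduce(apply_event, log, {})


log: List[Event] = [("widget", 10), ("gadget", 5), ("widget", -3)]

before_restart = current_stock(log)
after_restart = current_stock(log)   # a "restarted" node replays the same log

log.append(("gadget", -1))           # new events extend the stream
print(current_stock(log))
```

A restarted node reaches the same state as the one it replaces, because the stream, not the node, is the record of what happened.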

Milan

Tom Borchert

Milan has three components:

  1. A general-purpose stream algebra that encodes relationships between data streams (the Milan Intermediate Language or Milan IL)

  2. A Scala library for building programs in that algebra.

  3. A compiler that takes programs expressed in Milan IL and produces a Flink application that executes the program.

Component (2) can be extended to support interfaces in additional languages, and component (3) can be extended to support additional runtime targets. Considering just the multiple interfaces and the multiple runtimes, Milan looks a lot like the much more mature Apache Beam. The difference lies in (1), Milan’s general-purpose stream algebra.
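
To illustrate the separation between an algebra, a library for writing programs in it, and a compiler, here is a toy sketch in Python. It is not Milan IL or Milan's Scala API: the `Source`, `Map` and `Filter` node types and the `compile_to_python` function are hypothetical, and the "runtime" is plain Python lists rather than Flink.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass(frozen=True)
class Source:
    name: str


@dataclass(frozen=True)
class Map:
    parent: "Expr"
    fn: Callable


@dataclass(frozen=True)
class Filter:
    parent: "Expr"
    predicate: Callable


Expr = Union[Source, Map, Filter]


def sources(expr: Expr) -> List[str]:
    """Analyse a program without running it: which input does it read?"""
    return [expr.name] if isinstance(expr, Source) else sources(expr.parent)


def compile_to_python(expr: Expr) -> Callable[[Dict[str, list]], list]:
    """Toy compiler: turns the expression tree into a function over lists."""
    if isinstance(expr, Source):
        return lambda inputs: list(inputs[expr.name])
    upstream = compile_to_python(expr.parent)
    if isinstance(expr, Map):
        return lambda inputs: [expr.fn(x) for x in upstream(inputs)]
    if isinstance(expr, Filter):
        return lambda inputs: [x for x in upstream(inputs) if expr.predicate(x)]
    raise TypeError(f"unknown expression: {expr!r}")


# The program is a data structure describing relationships between streams.
program = Filter(Map(Source("readings"), lambda c: c * 9 / 5 + 32),
                 lambda f: f > 50)

run = compile_to_python(program)
print(sources(program))
print(run({"readings": [0, 20, 30]}))
```

Since the program is only a tree of relationships, the same tree could in principle be handed to a different compiler for a different runtime, or analysed without being executed at all.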

Meta Modelling

Conclusion

  • Challenges in decomposition, data and model deployment for ML.
  • Data oriented architectures and data first thinking are the solution.
  • Data oriented programming creates systems that are ready to deploy.
  • Opens the door to AutoAI and information dynamics analysis.

Thanks!

References

Brooks, F., n.d. The mythical man-month. Addison-Wesley.