Meet Luis Ceze, a 2024 BigDATA Wire Person to Watch

24 September 2024

Luis Ceze is many things: He’s the CEO and co-founder of OctoAI, the Lazowska Endowed Professor at the University of Washington, a co-founder of the Apache TVM project, and also a 2024 BigDATA Wire Person to Watch.

We recently caught up with Ceze to ask him a few questions about his many endeavors. Here’s what he said:

BigDATA Wire: You changed the name of your company from OctoML to OctoAI in January. Can you elaborate on the change?

Luis Ceze: We changed our name from OctoML to OctoAI to better reflect the expansion and evolution of our product suite, which more broadly addresses the growing market needs in the generative AI space.

In the last year, we significantly expanded our platform for developers to build production applications with generative AI models. This means companies can run any model of their choice, whether off-the-shelf, custom or open-source, and deploy it on-prem in their own environments or in the cloud.

Our latest offering is OctoStack, a turn-key production platform that delivers highly optimized inference, model customization and asset management at scale for large enterprises. This gives companies total AI autonomy when building and running generative AI applications directly within their own environments.

We already have dozens of high-growth generative AI customers, like Apate.ai, Otherside AI, Latitude Games, and Capitol AI, using the platform to seamlessly bring this highly reliable, customizable, efficient infrastructure directly into their own environment. These companies are now firmly in control of how and where they work with models, and they benefit from our maintenance-free serving stack.

BDW: You’re a co-founder of the Apache TVM project, which allows machine learning models to be optimized and compiled to different hardware. But GPUs are all the rage. Should we be more open to running ML models on other hardware?

Ceze: We’ve experienced more AI innovation in the last 18 months than ever before. From one day to the next, AI has shifted from the lab to a viable business driver. It’s clear that for AI to scale, we need to be able to run it on a broad range of hardware, from data centers to edge and mobile devices.

But we’re at a juncture that’s reminiscent of the cloud days. Back then, companies wanted the freedom to host data across more than one cloud, or a combination of cloud and on-premise.

Today companies also want accessibility and choice when building with AI. They want the choice to run any model, be it custom, proprietary or open source. They want the freedom to run said models on any cloud or local endpoint, without handcuffs.

This was our mission with Apache TVM early on, and this has carried on through my work at OctoAI. OctoAI SaaS and OctoStack are designed with the principle of hardware independence and portability to different customer environments.

BDW: GenAI is going from a period of experimentation in 2023 to deployment in 2024. What are the keys to making LLMs more impactful for businesses?

Ceze: We strongly believe that 2024 is the year that generative AI makes it out of development and into production. But to bring this to fruition, companies are going to have to focus on a few key things.

The first is controlling cost so the unit economics of LLMs work in your favor. Model training is a predictable expense, but inference (calling a model running in production) can get very expensive, especially if usage surges beyond what you’ve planned for.

Second is selecting the right model for your use case. It’s getting more challenging because of the sheer number of LLMs to choose from (there are 80,000 and counting), and model fatigue is beginning to set in. Finding one that is powerful enough to deliver the quality you need and runs efficiently enough to be cost-effective: that’s the balance you want to strike.

Third, techniques like fine-tuning are extremely important to help customize these LLMs for unique functionality. One trend we observe is that LLMs themselves are increasingly commodified, and the real value comes from customization to meet a specific, high-value use case.

BDW: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?

Ceze: Food for me is more than nutrition :). I love to learn about food; I love to cook it; I love to eat it.

I like to understand food “cross-stack”, from cultural aspects all the way down to chemistry. And then eating / drinking ;).

Another fun bit: some of my research was in DNA data storage, and my work recently traveled to the moon!

You can read more about the 2024 BigDATA Wire People to Watch here.
