Have you ever wondered what the future of AI will look like? How will it change our lives, our work, our society? If you are like me, you are probably fascinated by the rapid developments and innovations in this field. In this post, I want to share with you some of my thoughts and observations on the current state and trends of AI, based on my personal experience and research.
The last 18 months have been absolutely fascinating. We have witnessed the release of ChatGPT, powered by GPT-3.5, a groundbreaking natural language model that can produce coherent and diverse text on almost any topic. It has sparked a lot of interest and excitement among both casual and technical users. For example, some people use ChatGPT to prepare a meal plan for the whole week, while others use it to generate code, poetry, or music. It has also inspired an avalanche of new open source tools and models, as well as new vendors offering various AI solutions and services. At Natilik, especially in the last 6 months, we have been trying to get our heads around the use cases for products like Copilot, which is probably the first widely adopted product built on this generation of models. We have also been exploring the infrastructure side of AI: open source LLMs, retrieval-augmented generation (RAG), storage, network infrastructure, Azure OpenAI, and so on.
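Since RAG comes up in almost every one of those conversations, here is a minimal sketch of the pattern: retrieve the documents most relevant to a query, then build them into the prompt. The embedding function below is a deterministic placeholder (so the retrieval scores are meaningless), and the documents are made up; a real setup would use an embedding model and an LLM behind it, local or via Azure OpenAI.

```python
# Minimal illustrative RAG sketch: retrieve relevant context, then build a prompt.
# embed() is a placeholder; a real system would call an embedding model here.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder embedding: deterministic per text, but semantically meaningless.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

documents = [
    "Our support hours are 9am to 5pm, Monday to Friday.",
    "GPU clusters are provisioned through the internal portal.",
    "All model deployments must go through a security review.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank documents by cosine similarity to the query and keep the top k.
    q = embed(query)
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def build_prompt(query: str) -> str:
    # Stuff the retrieved context into the prompt that goes to the LLM.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How do I get access to a GPU cluster?"))
```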
It is 2024 and we are, in my humble opinion, at the beginning of building a completely new infrastructure, similar to what we saw in the late '90s. Nvidia's stock is reaching extreme heights, comparable to Cisco's in the early 2000s. We see Nvidia creating and selling vital tools, much like Cisco did nearly 30 years ago. Just as we grappled with what to do with all the performance and bandwidth then, we are now pondering what to do with all these GPUs. However, there are some major differences between now and the late 90s. Back then, Cisco had multiple competitors, such as Juniper, Nortel, and Lucent; Nvidia, with the very recent exception of AMD, really doesn't have that. I think Nvidia's dominance will only get stronger in the next few years. The major spending driver is also different: it isn't telecommunications companies, as it was in Cisco's case in the 90s, but public cloud providers. But will it stay that way? With Nvidia's latest announcement of the AI-RAN Alliance, which brings 5G and AI together to open the platform up to completely new workloads, we may see some shift.
While Nvidia, AWS, Azure, and Google don't share specific figures, it's estimated that 80-90% of Nvidia's data center revenue this year is driven by cloud providers. It is not usually enterprises that are building new AI infrastructure, but cloud providers that have lots of free cash and don't want to miss the opportunity to be part of this next generation of infrastructure. As an example, Nvidia's and Dell's backlogs are so massive that this demand is expected to continue for another two or three years.
To see the bigger picture, we have to understand that Nvidia is only the tip of the iceberg. We are seeing a coming wave of new vendors in storage and compute, such as VAST, Weka, Groq, and others. We are exploring and reshaping approaches to things we only started doing 12 months ago. One example is AI inference: according to Nvidia's CEO, around 40% of their shipments are already being used for this purpose. We have been building these models for the last 18 months and are now coming to the stage where we want to actually run them - efficiently and fast. Because of this, we are already starting to see the rise of new vendors in the LPU (language processing unit) market. Sam Altman's plan to raise trillions of dollars for chip development only confirms that we are at the start of something much bigger. Only time will tell how AI adoption will enable these new segments to grow. There are huge opportunities in this space. The Nvidia H100 consumes up to 700W, and Nvidia has declared that its next-generation B200 will consume up to 1,000W at full utilization. As scale and usage only increase, we are throwing a virtually endless amount of power and water into running AI infrastructure. This is not sustainable.
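To put those watt figures into perspective, here is a quick back-of-the-envelope calculation. The cluster size and overhead factor below are my own illustrative assumptions, not vendor figures; only the 700W per-GPU number comes from Nvidia's published H100 specification.

```python
# Back-of-the-envelope power estimate for a hypothetical GPU cluster.
# The cluster size and overhead factor are illustrative assumptions.

gpus = 16_000            # hypothetical cluster size
watts_per_gpu = 700      # Nvidia's stated maximum for an H100
overhead_factor = 1.5    # rough allowance for CPUs, networking, cooling (assumed)
hours_per_year = 24 * 365

megawatts = gpus * watts_per_gpu * overhead_factor / 1e6
mwh_per_year = megawatts * hours_per_year

print(f"Continuous draw: ~{megawatts:.1f} MW")       # ~16.8 MW
print(f"Annual energy:   ~{mwh_per_year:,.0f} MWh")  # ~147,000 MWh
```

Even with these rough assumptions, a single mid-sized cluster draws tens of megawatts continuously, before you even count the water used for cooling.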
However, don't get me wrong: most AI applications in the enterprise world are still at the PoC or development stage. We are still waiting for the AI use case or application that will drive massive adoption, just as, after the internet bubble, we waited for applications that would use all the new bandwidth built in the 90s and eventually got YouTube, MySpace, and Netflix. As AI infrastructure grows and AI becomes integral to areas like healthcare, education, and environmental conservation, it will lay the groundwork for these "killer" applications to emerge.
For me this is month number 18, and we are only at the beginning. The total addressable market (TAM) is massive, and it is actually still growing. The attack on Nvidia's revenues and margins is only about to start; for enterprises this will push costs down and drive innovation, which will in turn enable much wider adoption among everyday users. We will see lots of new vendors challenging Nvidia and trying to take a slice of its revenues. I still have a lot to learn; I know I'm only at the beginning. I'm currently deep into open source LLMs and really do believe that open-sourcing LLMs is the right way to do things. The progress open source LLMs have made in the last 12 months is staggering. I was testing one of the first LLaMA models 12 months ago, and when you compare it to the latest models like Gemma or Mixtral, the progress is amazing to see. Open source LLMs will be a key driver of innovation in the infrastructure space, which I'm really excited about. I'm currently experimenting with local LLMs running on my M1 laptop, using Copilot, and testing Azure OpenAI, so I will try to drop more info from time to time about what I'm working on and what I'm testing.
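As a taste of what "local LLMs on an M1 laptop" looks like in practice, here is a minimal sketch using llama-cpp-python, one of several ways to run GGUF models with Metal acceleration on Apple silicon. The model path is a placeholder; point it at whatever GGUF file you have downloaded (a Mistral or Gemma build, for example).

```python
# Minimal local-LLM sketch using llama-cpp-python (one of several options
# for running GGUF models on Apple silicon). The model path is a placeholder.

from llama_cpp import Llama

llm = Llama(
    model_path="models/mistral-7b-instruct.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to the GPU (Metal on M1/M2)
)

response = llm(
    "Explain retrieval-augmented generation in two sentences.",
    max_tokens=128,
    temperature=0.2,
)
print(response["choices"][0]["text"].strip())
```

It is remarkable how usable this already is on a laptop, which is exactly why I think open source models will keep pushing the infrastructure conversation forward.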