“Inside the World of LLM Building in China: Insights from an Alibaba Employee”

Chinese tech companies are gathering all sorts of resources and talent to narrow their gap with OpenAI, and experiences for researchers on both sides of the Pacific Ocean can be surprisingly similar. The parallel glimpse into their typical day reveals striking similarities, with wake-up times at 9 a.m. and bedtime around 1 a.m. Both start the day with meetings, followed by a period of coding, model training and brainstorming with colleagues. Besides building its own LLM in-house, Alibaba has been aggressively investing in startups such as Moonshot AI, Zhipu AI, Baichuan and 01.AI. Facing competition, Alibaba has been trying to carve out a niche, and its multilingual move could become a selling point.

Chinese technology companies are actively seeking out resources and talent in order to catch up to the progress of OpenAI. As demonstrated in a recent post by a researcher at Alibaba, the daily endeavors of developing powerful Large Language Models (LLMs) at the e-commerce giant are surprisingly similar to those at OpenAI.

A side-by-side look at their daily routines shows how closely the schedules align. Both wake up around 9 a.m. and start the workday with meetings, followed by coding, model training, and brainstorming with colleagues. Even after heading home, the night continues with experimental tasks until it’s time for bed.

But it’s how they spend their free time that sets the two apart. Alibaba’s Binyuan Hui uses his to catch up on “what is happening in the world” by reading research papers and browsing X. As one observer noted, Hui did not unwind with a glass of wine the way OpenAI’s Jason Wei did.

This intense pace is indicative of the environment for LLMs in China, where top tech talent from prestigious universities is flocking to tech corporations to produce highly competitive AI models.

To an extent, Hui’s rigorous routine could reflect a personal drive to match (if not exceed) the advances of Silicon Valley companies in AI. This is distinct from the “996” work hours associated with more “traditional”, operations-heavy Chinese internet businesses such as video games and e-commerce.

Hui is a member of the technical staff at Qwen, and his day might look something like this:

  1. [9:00am] Wake up, with an extra 15 minutes to remain in bed
  2. [9:30am] Taking a cab to work, catching up on the latest post from @_jasonwei on X
  3. [10:00am] Work…https://t.co/7o47EQrWcW — Binyuan Hui (@huybery) February 21, 2024

Notably, even renowned AI investor and computer scientist Kai-Fu Lee puts in considerable effort. When asked about his LLM unicorn 01.AI in an interview last November, Lee acknowledged that late nights were common but said employees were willing to put in the hard work. That same day, one staff member reached out to him at 2:15 a.m. to express excitement about being a part of 01.AI’s mission.

This dedication to intense work schedules highlights the urgency of targets set by tech firms in China, subsequently leading to the rapid development and implementation of LLMs.

The Qwen team has open sourced a series of foundation models trained on both English and Chinese data. The largest of these has 72 billion parameters; the parameter count is a rough indicator of how well a model can generate contextually relevant responses. (For some perspective, OpenAI’s GPT-3 is believed to have 175 billion parameters, while its latest LLM, GPT-4, is rumored to have 1.7 trillion. However, it can be argued that the purpose of a specific LLM is the crucial factor in judging the value of a high parameter count.)

The team has also been swift to introduce commercial applications. In April of last year, Alibaba integrated Qwen into its enterprise communication platform, DingTalk, and its online retail site, Tmall.

A definitive leader has yet to emerge in China’s LLM space, with venture capital firms and corporate investors spreading their bets. Aside from developing its own in-house LLM, Alibaba has also been actively investing in startups such as Moonshot AI, Zhipu AI, Baichuan, and 01.AI.

In the face of competition, Alibaba is striving to carve out its own niche, and its multilingual capabilities could become a selling point. In December, the company introduced an LLM for several Southeast Asian languages. Known as SeaLLM, it can process information in Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Tagalog, and Burmese. With a significant presence in the region through its cloud computing business and its acquisition of e-commerce platform Lazada, Alibaba could integrate SeaLLM into its services in the future.

Ava Patel

Ava Patel is a cultural critic and commentator with a focus on literature and the arts. She is known for her thought-provoking essays and reviews, and has a talent for bringing new and diverse voices to the forefront of the cultural conversation.
