Podcast #70: Qualcomm Driving On-device Generative AI to Power Intelligent Experiences at the Edge

Generative AI tools like ChatGPT and Google’s Bard have disrupted the industry. However, they are still limited to browser windows and smartphone apps, where the processing is done through cloud computing. That is about to change, as Qualcomm Snapdragon-powered devices will soon be able to run generative AI on-device.

At MWC 2023, Qualcomm showcased Stable Diffusion on a Snapdragon 8 Gen 2-powered Android smartphone. The demo showed how a smartphone can generate a new image from text commands or even change the background, without connecting to the internet. Running generative AI apps directly on a device offers several advantages, including lower operational costs, better privacy and security, and the reliability of working without internet connectivity.

ALSO LISTEN: Podcast #69: ChatGPT and Generative AI: Differences, Ecosystem, Challenges, Opportunities

In the latest episode of ‘The Counterpoint Podcast’, host Peter Richardson is joined by Qualcomm’s Senior Vice President of Product Management Ziad Asghar to talk about on-device generative AI. The discussion covers a range of topics from day-to-day use cases to scaling issues for computing resources and working with partners and the community to unlock new generative AI experiences across the Snapdragon product line.

Click the play button to listen to the podcast

You can read the transcript here.

Podcast Chapter Markers

01:35: Ziad starts by defining generative AI and comparing it with machine learning and other types of AI.

03:56: Ziad talks about AI experiences that are already present in Snapdragon-powered devices.

06:24: Ziad addresses the scaling issue for computing resources used to train large language models.

09:46: Ziad deep dives into the types of day-to-day applications for generative AI on devices like a smartphone.

13:34: Ziad talks about the hybrid AI model, involving both cloud interaction and edge.

15:43: Ziad on how Qualcomm is leveraging its silicon chip capabilities to unlock generative AI experiences.

19:20: Ziad on how Qualcomm is working with its ecosystem and the developer community.

21:57: Ziad touches on the privacy and security aspect with respect to on-device generative AI.


Podcast #69: ChatGPT and Generative AI: Differences, Ecosystem, Challenges, Opportunities

Generative AI has been a hot topic, especially after the launch of ChatGPT by OpenAI. It has even exceeded Metaverse in popularity. From top tech firms like Google, Microsoft and Adobe to chipmakers like Qualcomm, Intel, and NVIDIA, all are integrating generative AI models in their products and services. So, why is generative AI attracting interest from all these companies?

While generative AI and ChatGPT are both used for generating content, what are the key differences between them? The content generated can include solutions to problems, essays, email or resume templates, or a short summary of a big report to name a few. But it also poses certain challenges like training complexity, bias, deep fakes, intellectual property rights, and so on.

In the latest episode of ‘The Counterpoint Podcast’, host Maurice Klaehne is joined by Counterpoint Associate Director Mohit Agrawal and Senior Analyst Akshara Bassi to talk about generative AI. The discussion covers topics including the ecosystem, companies that are active in the generative AI space, challenges, infrastructure, and hardware. It also focuses on emerging opportunities and how the ecosystem could evolve going forward.

Click to listen to the podcast

Click here to read the podcast transcript.

Podcast Chapter Markers

01:37 – Akshara on what is generative AI.

03:26 – Mohit on differences between ChatGPT and generative AI.

04:56 – Mohit talks about the issue of bias and companies working on generative AI right now.

07:43 – Akshara on the generative AI ecosystem.

11:36 – Akshara on what Chinese companies are doing in the AI space.

13:41 – Mohit on the challenges associated with generative AI.

17:32 – Akshara on the AI infrastructure and hardware being used.

22:07 – Mohit on chipset players and what they are actively doing in the AI space.

24:31 – Akshara on how the ecosystem could evolve going forward.


AI Business Model on Shaky Ground

OpenAI, Midjourney and Microsoft have set the bar for chargeable generative AI services with ChatGPT (GPT-4) and Midjourney costing $20 per month and Microsoft charging $30 per month for Copilot. The $20-per-month benchmark set by these early movers is also being used by generative AI start-ups to raise money at ludicrous valuations from investors hit by the current AI FOMO craze. But I suspect the reality is that it will end up being more like $20 a year.

To be fair, if one can charge $20 per month, have 6 million or more users, and run inference on NVIDIA’s latest hardware, then a lot of money can be made. If one then moves inference from the cloud to the end device, even more is possible as the cost of compute for inference will be transferred to the user. Furthermore, this is a better solution for data security and privacy as the user’s data in the form of requests and prompt priming will remain on the device and not be transferred to the public cloud. This is why it can be concluded that for services that run at scale and for the enterprise, almost all generative AI inference will be run on the user’s hardware, be it a smartphone, PC or a private cloud.

Consequently, assuming that there is no price erosion and endless demand, the business cases being touted to raise money certainly hold water. While the demand is likely to be very strong, I am more concerned with price erosion. This is because outside of money to rent compute, there are not many barriers to entry and Meta Platforms has already removed the only real obstacle to everyone piling in.

The starting point for a generative AI service is a foundation model which is then tweaked and trained by humans to create the service desired. However, foundation models are difficult and expensive to design and cost a lot of money to train in terms of compute power. Up until March this year, there were no trained foundation models widely available, but that changed when Meta Platforms’ family of LLaMA models “leaked” online. Now they have become the gold standard for any hobbyist, tinkerer or start-up looking for a cheap way to get going.

Foundation models are difficult to switch out, which means that Meta Platforms now controls an AI standard in its own right, similar to the way OpenAI controls ChatGPT. However, the fact that it is freely available online has meant that any number of AI services for generating text or images are now freely available without any of the constraints or costs being applied to the larger models.

Furthermore, some of the other better-known start-ups such as Anthropic are making their best services available online for free. Claude 2 is arguably better than OpenAI’s paid ChatGPT service and so it is not impossible that many people notice and start to switch.

Another problem with generative AI services, outside of foundation models, is that there are almost no switching costs to move from one service to another. The net result of this is that freely available models from the open-source community combined with start-ups, which need to get volume for their newly launched services, are going to start eroding the price of the services. This is likely to be followed by a race to the bottom, meaning that the real price ends up being more like $20 per year rather than $20 per month. It is at this point that the FOMO is likely to come unstuck as start-ups and generative AI companies will start missing their targets, leading to down rounds, falling valuations, and so on.
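To put rough numbers on that price erosion, the arithmetic can be sketched as below. This is only an illustration using the round figures quoted above (6 million users, $20 per month versus $20 per year); the function and variable names are hypothetical, and it assumes the subscriber count stays constant, which the article does not claim.

```python
# Illustrative only: the inputs are the article's round numbers,
# not reported financials for any company.
def annual_revenue(subscribers: int, price_per_month: float) -> float:
    """Annual subscription revenue in dollars."""
    return subscribers * price_per_month * 12


SUBSCRIBERS = 6_000_000        # "6 million or more users" from the text
PRICE_TODAY = 20.0             # $20 per month
PRICE_ERODED = 20.0 / 12       # $20 per year, expressed per month

today = annual_revenue(SUBSCRIBERS, PRICE_TODAY)
eroded = annual_revenue(SUBSCRIBERS, PRICE_ERODED)

print(f"At $20/month: ${today / 1e9:.2f}bn per year")
print(f"At $20/year:  ${eroded / 1e9:.2f}bn per year")
print(f"Revenue drop: {1 - eroded / today:.0%}")
```

Under these assumptions, the same user base generates one-twelfth of the revenue, which is the gap that business cases built on $20 per month would have to absorb.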

There are plenty of real-world use cases for generative AI, meaning that it is not the fundamentals that are likely to crack but merely the hype and excitement that surrounds them. This is precisely what has happened to the Metaverse where very little has changed in terms of developments or progress over the last 12 months, but now no one seems to care about it.

(This is a version of a blog that first appeared on Radio Free Mobile. All views expressed are Richard’s own.)

Related Posts

Artificial Intelligence: Irrational Exuberance is in Full Swing

As surely as autumn and winter follow summer, the current exuberance around AI is not going to last simply because the machines remain incapable of living up to the expectations that have been set for them.

These cycles typically take the form of a discovery of some description followed by a ramping of expectations which in turn leads to large amounts of money being invested for fear of missing out (FOMO).

The problem is that the expectations that are set are always unrealistic, meaning that when the time comes to deliver on those expectations, disappointment sets in. This is followed by collapsing valuations, bankruptcies and forced consolidation as investors are no longer willing to suspend disbelief.

This is the fourth AI Hype cycle with the others occurring in the 1960s, 1980s and 2017-2019, and this hype cycle looks exactly the same as the others except that it is much larger. Looking at investment activity and news flow, it is also very clear exactly where we are in the cycle.

First, expectations

  • The ability of Large Language Models (LLMs) to mimic human behavior has convinced some of the big names (like Professor Geoffrey Hinton) that artificial superintelligence is now materially closer than it was before.
  • While LLMs do have some very useful and lucrative use cases, they still have no causal understanding of the tasks they are performing.
  • This is why they hallucinate, make the most basic factual errors and are generally completely unreliable.
  • Therefore, the machines remain as stupid as ever. There is no evidence whatsoever that these machines are able to think.
  • But the problem is that they are so good at pretending to think that they are able to fool the great minds that created them.
  • Instead, all they do is calculate statistical relationships, meaning that the big promises that have been made will not be kept.

Second, investment

  • There are already many examples of money being thrown at start-ups with valuations and fundamentals being an afterthought:
      • OpenAI’s $30-billion valuation with a corporate culture that doesn’t want to make any profit.
      • Inflection AI raising $1.3 billion from Microsoft and NVIDIA at an estimated valuation of around $5 billion despite having only been around for a year and having no commercial product.
      • Mistral AI raising $113 million at a $260-million pre-money valuation despite being only a few weeks old with no revenues, no product and probably only the vaguest idea of what it is going to do.
  • This can be described as the very definition of a bubble where rationality gets lost in the mad rush toward the next big thing. A lot of shirts are going to be lost.

The latest innovations around LLMs have produced some remarkable abilities which, no doubt, will be put to both good and lucrative use. However, the technology upon which they are based has not changed, meaning that the limitations that prevented digital assistants and autonomous driving from being useful for anything more than the most basic tasks are also going to trip LLMs up.

Furthermore, this is no longer the exclusive realm of the big, well-financed companies that can pay tens of millions of dollars for massive compute capacity, as the hobbyists and enthusiasts are now creating generative AI. Meta Platforms’ series of LLMs called LLaMA is now freely available to anyone who wants to tinker, and advances in training techniques have meant that it is possible to fine-tune a 7bn-parameter model on a powerful laptop.

This is why there are models popping up all over the place that are completely free to use. Some of them actually work quite well. Hence, the pricing of $20 per month for services like GPT-4, Perplexity AI and Midjourney may soon come under relentless pressure. This is really bad news for investors relying on spreadsheets for their return because no one seems to have modeled this scenario out.

The first sign of trouble will come when companies come back to the market after spending the money on fancy offices and expensive staff but nothing to show for the investments so far. This is when the down rounds begin, disillusionment sets in, reality makes its presence felt and winter begins.

One suspects this will begin sometime in the first half of 2024 and the fallout will not be pretty.

(This is a version of a blog that first appeared on Radio Free Mobile. All views expressed are Richard’s own.)


Related Posts

AI Needs to Reside in the Vehicle to Work Well

Mercedes is running a beta program where those that opt in will be able to access ChatGPT from their vehicle by interacting with the voice assistant already present in MBUX-equipped vehicles. But rather than the cloud-based service that Mercedes is going with today, it should be looking at implementing ChatGPT directly in the vehicle.

  • Mercedes owners in the US can enroll for the program by accepting an update for their car.
  • The test is due to run for three months and is being supported by Microsoft’s Azure OpenAI Service, which is an API to which clients can connect their services to have generative AI functionality.
  • Mercedes is able to implement this service very easily because all it is really doing is providing a prompt for the vehicle assistant to fill in, send it to the cloud and then read out the results.
  • This means that all of the inference or processing of the request will be done in the cloud with the voice assistant doing nothing more than acting as a front end to provide the voice functionality.
  • The vehicle is a use case where generative AI could have a disproportionately large impact. This is because a touch-based icon grid is a substandard user experience no matter who provides it.
  • The problem that the car makers have is that their icon grid is much worse than those of Apple, Google or Tesla.
  • Furthermore, in 2016 and 2017 we concluded that voice was the leading contender to improve the digital experience in the vehicle but that voice was not good enough to create an acceptable user experience.
  • This is why vehicles are still limping along with smartphones embedded in the dashboard.
  • We have also concluded that generative AI represents a significant step forward in the ability of machines to communicate with humans and provide a user interface for a digital service.
  • Consequently, generative AI offers a significant opportunity for vehicle makers to win back the digital initiative that they have ceded to the digital ecosystems.
  • This is extremely important as vehicle makers’ ability to monetize the market for in-vehicle digital service will be contingent on their ability to remain relevant in the digital vehicle experience.
  • This is why Apple and Google are aggressively going after the vehicle and, so far, the OEMs have mounted feeble resistance or offered complete capitulation.
  • The problem with this approach is that the only way to implement generative AI effectively in the vehicle is to put it directly in the vehicle.
  • This is because reliability and speed are critical, and in this example when the network goes, the service goes with it.
  • Furthermore, it is unlikely that there will be any real integration with the vehicle, meaning that telling ChatGPT that one is feeling hot is likely to result in silence rather than the air-conditioner being turned up.
  • Using ChatGPT as the benchmark implementation in the vehicle will have a profound impact on the cost of the vehicle’s electronics as well as its power consumption which in an EV is a deal breaker.
  • There are rapid developments going on in the open-source community that may make this a lot easier to achieve, but implementing large language models outside of the data center remains a work in progress.
  • Despite the current limitations, the potential for generative AI to help OEMs to overcome their digital shortcomings is substantial and represents one of the best opportunities the OEMs have had for a long time.
  • The risk is that if no one uses it as a result of the way the Mercedes experiment is implemented, it will lead to the (wrong) conclusion that putting it in the vehicle is a waste of time.
  • This would lead to the squandering of another opportunity, resulting in digital irrelevance and greater commoditization.
  • We remain pretty pessimistic about the outlook for the OEMs.

(This is a version of a blog that first appeared on Radio Free Mobile. All views expressed are Richard’s own.)

Related Posts

Google I/O 2023 Key Highlights: Generative AI, Search, New Pixel Devices and More

Google held its annual developer conference, I/O 2023, at the Shoreline Amphitheatre in Mountain View, California. The company’s biggest event of the year was attended by developers, media, and partners. At the conference, Google offered a glimpse of new features powered by generative AI, and how it is changing the landscape of core products from Gmail to Sheets, Slides, and Google Photos. The company also showcased the advanced capabilities of its chatbot Bard. Google also made hardware announcements by launching its first foldable smartphone, the Pixel Fold, an affordable Pixel 7a smartphone, and the Pixel Tablet. There is a lot to talk about, and here are some key highlights from Google I/O 2023.

Generative AI takes centerstage

“AI is having a very busy year, so we’ve got lots to talk about,” said Sundar Pichai, as he started his keynote address. Google spent over one hour talking about AI and the company’s “bold and responsible” approach.

Gmail getting a new “help me write” feature

After features like Smart Reply and Smart Compose, Gmail is now getting a new feature called “help me write.” It is a simple feature that can help you save time and effort when composing emails. Say your flight was just canceled and you want to write an email asking for a refund. The new feature can grab flight details from the airline cancelation email and compose a draft email for you. If you think the draft is too short, there is an option to “Elaborate” to make it more compelling, or you can even click on “Recreate” for a completely new email. The feature will start rolling out as a part of Google Workspace in the coming weeks.

Google Maps gets a new Immersive View of routes

When using Maps for navigation, users can click on the “Immersive View” option to get a photorealistic view of the route. But that’s not all, it will also show you the air quality index (AQI), real-time weather updates and the weather forecast as well. The feature will be rolled out to 15 cities around the globe by the end of 2023.

Google Photos gets Magic Editor

According to Google’s data, 1.7 billion photos are edited on Google Photos every month. After introducing Magic Eraser which lets you remove unwanted objects and people from photos, Google is now taking photo editing to the next level with Magic Editor. It uses generative AI and semantic engineering to help you edit and enhance your images.

In addition to removing unwanted objects, the Magic Editor can also fill in parts of the image that are not in the photo. Google showcased this feature with an example where you can move the subject in the photo and other objects too. Magic Editor can then add cropped-off parts like balloons and even extend objects such as benches, like in the examples below. Though the results are not perfect, as clearly evident in the below examples, it is still amazing what generative AI can achieve. The feature is coming to Google Photos later this year.

PaLM 2, Google’s latest LLM announced

The latest large language model (LLM) from Google, PaLM 2, can perform a broad range of tasks, from natural language generation to writing code, reasoning and even multilingual translation. It supports over 100 spoken languages. More than 25 Google products are now using PaLM 2. It is available in four sizes named Gecko, Otter, Bison and Unicorn, of which Gecko is small enough to run on a smartphone, even in offline mode.

Google is also deeply invested in AI responsibility, where AI-generated content will have metadata and watermarking to identify the content.

Google Bard gets better

Google’s AI chatbot, Bard, can now perform coding and debugging in over 20 programming languages including C, C++, JavaScript and Python. You can ask Bard for must-see places, write funny captions based on photos, or even ask for suggestions for colleges and the different teaching programs they offer, thus helping with career advice. Bard replies can also be exported in Docs or in Gmail. The company is also opening Bard access to users in over 180 countries in English.

Generative AI in Google Workspace

Google is also bringing generative AI to Workspace, including Docs, where it provides writing help. For instance, you can ask the AI assistant to write a job description for a sales representative position, a cover letter, a video script, product descriptions, invitations and more.

In Slides, you can create and add auto-generated images, video and audio clips to add flair to a presentation. In Google Sheets, you can ask the assistant to create a roster with rates, or pull out insights and analysis from raw data, and more.

Google Search supercharged with generative AI

Lastly, Google also talked about bringing generative AI to Google Search which can answer queries by summarizing text information found online. Users can then ask follow-up questions to get even more specific answers. Google demonstrated with an example of a search query for e-bikes. The algorithms can list and summarize product reviews from various websites and offer a link to purchase online. Interested users in the US can give this a try with a new feature called Search Labs, but it will not be activated by default for all users.

Commenting on Google’s generative AI announcements at I/O 2023, senior analyst Akshara Bassi said, “Google is integrating all its core products with AI which will make AI more reachable to the masses and integrated into their lives. The reinvention of basic tools such as Gmail with the introduction of ‘help me write’ feature is possible because of AI. The AI-powered immersive Google Maps promises to enhance the journey and plan it better. Google promises to become a one-stop destination across all its services from Photo editing to Google Search and integration of AI in all those services will accelerate the trajectory of AI as a ubiquity in our lives.”

With these announcements, Google is showing it can respond to the threat posed by Microsoft. Google had previously seemed to be taken by surprise by the completeness of Microsoft’s offerings and appeared defensive in the face of the competitive threat. However, during the I/O event, Google seemed more composed, and its array of AI-based applications and services shows it has likely headed off the threat to its core search business for now.

Google Pixel hardware announcements at I/O 2023

As usual, Google also made hardware announcements starting with the affordable Pixel 7a smartphone, the Pixel Tablet, and the much-awaited Pixel Fold. All these devices aim to offer the best of Google’s hardware and software experience. Under the hood, all three devices are powered by the custom Google Tensor G2 SoC which brings features like Pixel Call Assist which helps avoid long wait times, navigate phone tree menus, and more. On-device Machine Learning also enables enhanced speech features like live translation and transcribe. The devices also offer enhanced security with Titan M2 security chip, secure face unlock, VPN, crisis alerts and car crash detection among other features.

Google Pixel 7a: Price, key specifications, and more

The Google Pixel 7a is available for $499 and comes with 8GB of LPDDR5 RAM and 128GB of UFS 3.1 storage. The smartphone comes with dual SIM options (physical SIM and eSIM) and supports Wi-Fi 6e and 5G. It features a 4385mAh battery and is capable of wireless charging.

Image: Google Pixel 7a (Source – Google)

The smartphone has a 6.1-inch FHD+ OLED display with a 90Hz refresh rate, a 64MP primary camera with OIS, a 13MP ultrawide camera and a 13MP front camera. Users get the same Pixel camera features as on the Pixel 7 Pro, including Magic Eraser, Photo Unblur, Long Exposure, Top Shot and more. In terms of videography, there is support for up to 4K 60fps for the rear camera and 4K 30fps for the front camera. Stereo recording, wind noise cancellation, and speech enhancement features are also present.

Commenting on the Pixel 7a launch, associate director Hanish Bhatia said, “For Google, the US market accounts for a significant share of the overall Pixel sales. Although Google has a low single-digit market share in the US, it is significant when we look at the overall premium ($600+) Android market. The key focus with Pixel 7 and 7 Pro was to plug the flow of premium Android users into iOS. But iOS has also gained in the sub-$400 prepaid market in the US with the iPhone SE and subsidized iPhone 11 series. This is where the ‘Pixel A series’ is key. However, the Pixel A series still focuses on post-paid rather than prepaid.”

Google Pixel Fold: Price, specifications, and more

The Pixel Fold, Google’s first foldable, is here, and while Google is late to join the foldable race, it has learned from existing foldable products from other OEMs. For its first device, Google has gone with a book-type foldable with a compact form factor like the OPPO Find N, rather than the taller form factor like the Samsung Galaxy Z Fold4. Our recent consumer insights study revealed that 28% of US smartphone users are likely to opt for a foldable as their next purchase, and it makes sense for most OEMs to focus on foldables as an important revenue driver.

The Pixel Fold is available for $1,799, the same price as the Galaxy Z Fold4, and Google has sweetened the deal with a pre-booking offer where users will receive a free Pixel Watch worth $349 (Wi-Fi) or $399 (LTE).

Image: Google Pixel Fold overview (Source – Google)

Google claims it is the thinnest foldable smartphone in the market, and it comes with an IPX8 rating for water resistance. In terms of specifications, the Pixel Fold comes with a 5.8-inch wide FHD+ cover display, and a 7.6-inch internal foldable display. Both feature OLED panels and come with a 120Hz screen refresh rate. There is 12GB of LPDDR5 RAM and 256GB of UFS 3.1 storage, along with a 4821mAh battery that supports fast wired charging and wireless charging.

In the photography department, the foldable smartphone is equipped with triple rear cameras – a 48MP main sensor, a 10MP ultrawide camera and a 10MP sensor with a telephoto lens that supports 5x optical zoom, and 20x super res zoom. The front cover screen has a 10MP selfie snapper. There is also a fifth 8MP camera on the inner folding screen for selfies and video calling.

Commenting on the Google Pixel Fold launch, senior analyst Maurice Klaehne said, “Foldables are increasingly becoming an important revenue stream for OEMs as sales continue to grow. The Pixel Fold will help Google’s share in the ultra-premium segments in markets such as the US where Samsung has been the de facto market leader as there is no other competition. The EU market is similar but has a slightly wider selection of foldables. In Asia, having a foldable is now table stakes. At $1,799, the Pixel Fold is priced in line with the competition, but Google can better optimize the experience with the latest Android 13, which builds on the tablet-focused Android 12L.”

Google Pixel Tablet: Price, specifications, and more

Lastly, Google’s Pixel Tablet, which was announced last year, will soon be available for purchase. The device is priced at $499 for the 8GB RAM and 128GB storage model, and $599 for the 256GB model. Google is also bundling the magnetic charging speaker dock for free with the tablet.

Image: Google Pixel Tablet (Source – Google)

The Pixel Tablet features a 10.95-inch WQXGA (2,560×1,600 pixels) LCD screen with an aspect ratio of 16:10. It also supports the USI 2.0 stylus pen. The tablet comes with quad speakers and three mics for calls and recording. The Google Pixel Tablet features an 8MP camera in the front and an 8MP camera at the back. With a 27Wh battery, the tablet supports 15W wired charging.

The tablet connects with the magnetic dock using a Pogo pin connector, which then doubles as a charging device and an additional 43.5mm full-range speaker. This way, it can become a smart display or a smart speaker with a display, just like the Google Nest Hub. There is also a built-in Chromecast feature in the tablet, allowing you to stream music and videos right from your phone to the tablet and enjoy an immersive audio experience.

Key Takeaways:

• Google is going all-in with generative AI to enhance the overall experience on its range of products and services. It has likely done enough to neutralize the perceived threat from Microsoft – for now.
• Features like immersive view on Google Maps and Magic Editor on Google Photos are valuable additions to popular apps.
• The Pixel Fold launch shows Google’s ambition to grab share in the ultra-premium segment and staunch the gradual bleed of Android users to iOS.
• The Pixel Tablet is a smart move, showing Google’s renewed interest in large-screen devices and supporting the company’s ambition in smart homes. It can also help in the development of Android OS for different form factors.
• With the Pixel 7a, Google is looking to attract more users to the platform while showcasing its hardware and software capabilities. Again, it is responding to the threat from Apple that has been winning users over to iOS with its iPhone SE and older number series, e.g. the iPhone 11.

Related Posts

AI Voice Assistants to Push Success of Autonomous Driving, Software-defined Vehicle

  • AI voice assistants are being integrated into cars for hands-free and intuitive functionality.
  • Voice assistants like Google Assistant and Apple Siri can recognize and respond to natural language commands, allowing drivers to interact with their vehicles more effectively.
  • Integrating natural voice virtual assistants is complex and requires significant resources and expertise in learning and data collection. As a result, only a few companies can currently do it successfully.

ChatGPT’s popularity has encouraged many people to think about AI’s potential applications. One of them is in the automotive sector. With the simplification of the dashboard in vehicles, there has been a trend towards integrating more functions into the central display, such as navigation, entertainment, climate control and vehicle diagnostics. The central computer in vehicles is becoming more powerful and can do more things. All this allows easier and more user-friendly ways for drivers to interact with their vehicles while enabling more advanced and customizable functions for the vehicle itself.

Also, this has matched the development of software-defined vehicles, which take this integration a step further by using a centralized software architecture to control all vehicle functions. This allows for greater flexibility and the ability to update vehicle systems over the air (OTA).

There has been an increasing demand for additional functions to be integrated into the central display, such as voice assistant, in-car digital assistant, and other advanced driver assistance systems (ADAS). However, oversimplification leads to many problems. Some people still like to use knobs or buttons in the auto cabin, despite the prevalence of touchscreen displays in modern cars. Below are some reasons:

  • Tactile feedback: Many people find it more intuitive to use these physical controls than to navigate through a digital menu on a touchscreen display. Knobs and buttons provide physical feedback when they are pressed or turned, which can make it easier to interact with the controls without taking your eyes off the road.
  • Visibility: In some cases, knobs and buttons can be easier to see and use in bright sunlight or other challenging light conditions, as they do not suffer from glare or reflections in the same way that a touchscreen display might.
  • Safety: Using physical knobs and buttons can be safer than interacting with a touchscreen display, as it allows the driver to keep their hands on the wheel and their eyes on the road.

Therefore, it is crucial to have a simplified human-machine interface (HMI) on the central screen of a car that is user-friendly, reliable and intuitive in order to minimize the learning curve for drivers and enable them to easily and efficiently access the desired features without encountering any errors. The most important element of such an interface is the virtual voice assistant.

There are several popular virtual voice assistants available in the market today, like Amazon Alexa, Google Assistant, Apple Siri, Microsoft Cortana, Samsung Bixby, Baidu Duer and Xiaomi Xiao AI. In addition, there are other proprietary virtual voice assistants designed specifically for the automotive industry, such as Cerence, SoundHound Houndify, Harman Ignite and Nuance Dragon Drive.

The majority of these virtual assistants in the automotive industry are created to seamlessly integrate with the vehicle infotainment systems to offer drivers a variety of voice-activated functionalities, including hands-free phone calls, weather updates, music streaming, and voice-activated navigation. Moreover, they are designed to recognize and respond to natural language commands, enabling drivers to engage with their vehicles in a more intuitive and effortless manner. By providing a safe and convenient way to interact with vehicles, these virtual voice assistants allow drivers to keep their hands on the wheel and eyes on the road.

While virtual voice assistants have improved significantly in recent years, there are still some challenges that need to be addressed. Here are some common problems that currently exist with virtual voice assistants:

  • Understanding complex commands: Virtual voice assistants may encounter difficulties in comprehending intricate commands or requests that involve several variables or conditions.
  • Accents and dialects: Virtual voice assistants may also have difficulty understanding users with different accents or dialects.
  • Background noise: Virtual voice assistants can be sensitive to background noise, which can make it difficult for them to understand user commands or requests.
  • Privacy concerns: As virtual voice assistants become more ubiquitous, there are growing concerns about the privacy of user data.
  • Integration with other automotive systems: Virtual voice assistants may have difficulty integrating with other systems or devices, which can limit their functionality and usefulness.

ChatGPT can converse in natural language like a human because it is a language model that has been trained on a massive amount of text data using a deep-learning architecture called the transformer. During its training, ChatGPT was exposed to vast amounts of natural language text data, such as books, articles and web pages. This allowed it to learn the patterns and structures of human language, including grammar, vocabulary, syntax and context.

A language model that has first been trained on a large corpus of unlabeled text can then be fine-tuned by further training on specialized data sets, which may include frequently used vehicle commands or a range of distinct national accents. This allows for the development of models that are finely tuned to in-car use while retaining their general language understanding capabilities.

The following figure shows our forecast for the use of intelligent voice control in cars.


Source: Global Automotive ADAS/AD Sensor Forecast by the Level of Autonomy, 2021-2030F

Overall, the potential of natural language voice conversation assistants in cars is vast, and with ongoing research and development, we can expect to see more advanced and sophisticated voice assistants in the future. Developing a successful natural language virtual voice assistant for use in cars is a complex and time-consuming process that requires multiple iterations of training and fine-tuning.

Since the development necessitates a considerable amount of data, computational resources and expertise, only a handful of companies such as Microsoft, Tesla, NVIDIA, Qualcomm, Google and Baidu have the resources to undertake this work. The development of the technology is estimated to take three to four years. Over that period, demand is also expected to increase for vehicles above Level 3 autonomy.

As highlighted in our report “Should Automotive OEMs Get Into Self-driving Chip Production?”, the automotive industry will confront obstacles related to electrification and intelligent technology, necessitating sustained capital investments and support from semiconductor suppliers. Consequently, only a handful of established car manufacturers with considerable economies of scale will be able to finance these initiatives. The growing popularity of natural voice control in cars will only intensify these challenges.
