
May 12, 2022

About 24 years ago, Google was founded by two graduate students with a single product and a lofty mission: to organize the world's information and make it universally accessible and useful. In the decades since, we've continued to advance our technology in pursuit of that mission.

The progress we've made comes from years of investment in advanced technologies, from artificial intelligence to the technical infrastructure that powers it all. And once a year, on my favorite day of the year :), we share these updates with you at Google I/O.

Today, I talked about how we can advance two fundamental aspects of our mission—the development of knowledge and computing—to create products that can help. It's exciting to develop these products; it's even more exciting to see what these products can help people do.

Thanks to everyone who helped us get this done, especially our colleagues at Google. We are grateful for this opportunity.

-Sundar

Below is the full text of Sundar Pichai's keynote speech at the opening ceremony of today's Google I/O developer conference.

Hello everyone, and welcome! It's great to be back at Shoreline Amphitheatre after three years away. To the thousands of developers, partners, and colleagues at Google here with us: it's great to see all of you. And to the millions of people watching this conference around the world: we're so glad you could join us.

Last year, we shared how new breakthroughs in some of the most technically challenging areas of computer science are making Google's products more helpful in the moments that matter. All of this is in service of Google's timeless mission: to organize the world's information and make it universally accessible and useful.

I can't wait to show you how Google is furthering that mission in two key ways: first, by deepening our understanding of information so that we can turn it into knowledge; and second, by advancing the state of computing so that knowledge is easier to access, no matter who or where you are.

Today, you'll see the progress we've made on both fronts, and how it ensures Google's products are built to help. I'll start with a few quick examples. Throughout the COVID-19 pandemic, Google has focused on delivering accurate information to help people stay healthy. Over the last year, people used Google Search and Google Maps nearly two billion times to find places where they could get a vaccine.

Last year, Google's flood prediction technology sent flood warnings to 23 million people in India and Bangladesh

We've also improved Google's flood forecasting technology to help keep people safe in the face of natural disasters. During last year's monsoon season, our flood alerts notified more than 23 million people in India and Bangladesh, and we estimate this helped thousands of people evacuate in time.

Google Translate adds 24 new languages

In countries around the world, Google Translate has become an important tool for newcomers and residents to communicate with one another.

Using machine learning, we've added new languages to Google Translate, including Quechua

Real-time translation is a testament to how knowledge and computing come together to make people's lives better. More people are using Google Translate than ever before, but we still have work to do to make it universally accessible. There's a long tail of languages that are underrepresented on the web today, and translating them is a hard technical problem: translation models are usually trained with bilingual text, for example the same phrase in both English and Spanish, but there isn't enough publicly available bilingual text for every language.

So with advances in machine learning, we've developed a monolingual approach, where the model learns to translate a new language without ever seeing a direct translation of it. Working with native speakers and institutions, we found the quality of these translations meets a high bar, and we'll keep improving them.
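For readers curious what a "monolingual approach" can look like in practice, here is a minimal sketch of one common ingredient of such systems, iterative back-translation, where a model creates synthetic parallel data from monolingual text and then trains on it. The ToyTranslator class, the sentences, and the loop are all invented stand-ins; the keynote does not describe Google Translate's actual training recipe.

```python
# Toy sketch of iterative back-translation, one common way to learn translation
# from monolingual text alone. The "models" here are trivial stand-ins; a real
# system would use large neural sequence-to-sequence models and self-supervised
# pretraining. Nothing below reflects Google Translate's actual pipeline.

from dataclasses import dataclass, field


@dataclass
class ToyTranslator:
    """Stand-in for a neural translation model (e.g. target -> source)."""
    name: str
    parallel_data: list = field(default_factory=list)

    def translate(self, sentence: str) -> str:
        # A real model would decode with a trained network; we just tag the text.
        return f"<{self.name} translation of: {sentence}>"

    def train(self, pairs: list) -> None:
        # A real model would run gradient updates on (source, target) pairs.
        self.parallel_data.extend(pairs)


# Monolingual text in the new (low-resource) language; no translations exist.
monolingual_target = [
    "sentence A in the new language",
    "sentence B in the new language",
]

target_to_english = ToyTranslator("target-to-en")
english_to_target = ToyTranslator("en-to-target")

for round_id in range(2):  # a couple of back-translation rounds
    # 1. Use the current target-to-English model to create synthetic English sources.
    synthetic_pairs = [(target_to_english.translate(t), t) for t in monolingual_target]
    # 2. Train the English-to-target model on (synthetic English, real target) pairs.
    english_to_target.train(synthetic_pairs)
    # 3. Doing the same trick in the other direction would refine target_to_english,
    #    so each round can improve both models without any human-made parallel text.
    print(f"round {round_id}: trained on {len(synthetic_pairs)} synthetic pairs")
```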

We've added 24 new languages to Google Translate

Today, I'm excited to announce that we've added 24 new languages to Google Translate, including our first indigenous languages of the Americas. Together, these languages are spoken by more than 300 million people. Breakthroughs like this are powering a radical shift in how we access knowledge and use computers.

New upgrades to Google Maps

Much of what is knowable about our world goes beyond language—it exists in the physical and geographic space around us. For more than 15 years, Google Maps has been making this information rich and useful to help users navigate. Advances in AI are taking this work to new heights, whether it's extending our reach to remote areas or reimagining how to explore the world in a more intuitive way.

Advances in AI help map remote and rural areas

To date, we've mapped around 1.6 billion buildings and more than 60 million kilometers of roads worldwide. Some remote and rural areas have previously been difficult to map, due to a scarcity of high-quality imagery and distinct building types and terrain. To address this, we're using computer vision and neural networks to detect buildings from satellite imagery. As a result, since July 2020 the number of buildings on Google Maps in Africa has increased five-fold, from 60 million to nearly 300 million.

We've also doubled the number of buildings mapped in India and Indonesia this year. Globally, over 20% of the buildings on Google Maps have now been detected with these new techniques. Building on this work, we've made the dataset of buildings in Africa publicly available, and international organizations like the United Nations and the World Bank are already using it to better understand population density and to plan support and emergency assistance.
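As a rough illustration of the building-detection idea described above, the sketch below runs a tiny convolutional network over a fake "satellite tile" to produce a per-pixel building probability map. It's an untrained toy in PyTorch for intuition only, not the production model, imagery, or dataset Google uses.

```python
# Minimal sketch: per-pixel building detection on a satellite tile with a tiny
# convolutional network. Untrained and illustrative only; a real system uses
# far larger models, real imagery, and extensive labeled data.

import torch
from torch import nn

# A fake RGB satellite tile: batch of 1, 3 channels, 256x256 pixels.
tile = torch.rand(1, 3, 256, 256)

# A toy fully convolutional "segmenter": conv layers ending in one output channel
# that we read as the probability that each pixel belongs to a building.
segmenter = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),
)

with torch.no_grad():
    building_probs = torch.sigmoid(segmenter(tile))  # shape: (1, 1, 256, 256)

# Threshold the probabilities to get a building mask, then count "detected" pixels.
mask = building_probs > 0.5
print("building pixels:", int(mask.sum()), "of", mask.numel())
```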

Immersive View in Google Maps blends aerial and Street View imagery

We're also bringing new capabilities to Google Maps. Using advances in 3D mapping and machine learning, we're fusing billions of aerial and Street View images to create a new, high-fidelity representation of a place. These breakthrough technologies come together to power a new experience we call immersive view, which lets you explore a place like never before.

Let's go to London and take a look. Say you're planning to visit Westminster with your family. You can get this immersive view right from Google Maps on your phone and glide around the sights; that's Westminster Abbey. If you're thinking of heading to Big Ben, you can check whether the roads are congested and even see the weather forecast. And if you're looking to grab a bite during your visit, you can check out restaurants nearby and even take a glimpse inside.

What's amazing is that this isn't a drone flying around the restaurant; we use neural rendering to create the experience from images alone. And Google Cloud Immersive Stream allows this experience to run on just about any smartphone. The feature will start rolling out in Google Maps for select cities around the world later this year.

Another major Google Maps upgrade is eco-friendly routing. Launched last year, it shows you the most fuel-efficient route, giving you the choice to save on fuel and reduce carbon emissions. Eco-friendly routes have already rolled out in the US and Canada, and people have used them to travel approximately 86 billion miles, saving an estimated half a million metric tons of carbon emissions, the equivalent of taking 100,000 cars off the road.

Eco-friendly route to expand to Europe later this year

I'm excited to share that we're expanding this feature to more places, including Europe later this year. In this example of a map of Berlin, you can choose a route that's just three minutes slower but cuts fuel consumption by 18 percent. These small decisions have a big impact at scale. With the expansion into Europe and beyond, we expect the carbon-emission savings to double by the end of the year.

We also added a similar feature to Google Flights. When users search for flights between two cities, we also show users carbon estimates and other information such as prices and schedules, making it easy for users to choose greener flights. These eco-friendly features in Google Maps and Google Flights are very important to our goal of empowering 1 billion people to make more sustainable choices through our products, and we're excited to see these progress.

New YouTube features help users easily access video content

Beyond Google Maps, video is becoming an indispensable part of how we share information, communicate, and learn. Often when you come to YouTube, you're looking for a specific moment in a video, and we want to help you get to that content faster.

Last year, we introduced auto-generated chapters to make it easier to jump to the parts you're most interested in. It's also a great feature for creators, because it saves them the time of making chapters themselves. We're now applying DeepMind's multimodal technology, which uses text, audio, and video simultaneously to auto-generate chapters with greater accuracy and speed. With this, we now aim to increase the number of videos with auto-generated chapters ten-fold, from eight million today to 80 million over the next year.

Often the quickest way to get a sense of a video's content is to read its transcript, so we're also using speech recognition models to transcribe videos. Video transcripts are now available to all Android and iOS users.

Auto-generated chapters on YouTube

Next, we're bringing auto-translated captions on YouTube to mobile devices. This means viewers can now auto-translate video captions into 16 languages, helping creators reach a global audience.

Google Workspace helps boost productivity

Just as we're using AI to improve YouTube, we're building it into our Workspace products to help people be more productive. Whether you work at a small business or a large institution, chances are you spend a lot of time reading documents. Maybe you've felt that wave of panic when you realize you have a 25-page document to read and the meeting starts in five minutes.

At Google, whenever I get a long document or email, I find myself looking for a "TL;DR" at the top, short for "too long, didn't read." It got us thinking: wouldn't life be better if more things came with a TL;DR?

That's why we've introduced automatic summaries for Google Docs. Using one of our machine learning models for text summarization, Google Docs will automatically parse a document and pull out the key points.

This marks a big leap forward for natural language processing. Summarization requires understanding long passages, compressing the information, and generating language, capabilities that were beyond even the best machine learning models until recently.
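To make the idea of summarization concrete, here is a deliberately simple extractive summarizer that scores sentences by word frequency and keeps the top ones. It's a classic toy technique, not the abstractive machine learning model behind the Google Docs feature, which generates new sentences rather than selecting existing ones.

```python
# Toy extractive summarizer: score each sentence by how frequent its words are
# in the whole document, then keep the highest-scoring sentences. This is far
# simpler than the abstractive summarization described above, but it shows the
# basic "compress a long text into key points" idea.

import re
from collections import Counter

def summarize(text: str, max_sentences: int = 2) -> str:
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Pick the top-scoring sentences, but keep them in their original order.
    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    return " ".join(s for s in sentences if s in top)

document = (
    "The quarterly report covers revenue, hiring, and infrastructure. "
    "Revenue grew steadily across all regions this quarter. "
    "The team also hosted a small offsite. "
    "Infrastructure spending increased to support revenue growth."
)
print(summarize(document))
```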

And Docs is just the beginning. We're working on bringing summaries to other Google Workspace products. Over the next few months, Google Chat will get summaries of chat conversations, helping you quickly jump into a group chat or catch up on the important points.

In the coming months, we'll be adding summaries to Google Chat

And we're working on bringing transcription and summarization to Google Meet as well, so you can catch up on important meetings you missed.

Improving Google Meet video quality

Of course, there are plenty of times when people really do want to be in a virtual room with others. That's why we continue to improve audio and video quality, inspired by Project Starline, which we introduced at last year's I/O. We've been testing it across Google offices, gathering feedback and improving the technology for the future. And in the process, we've learned some things we can apply to Google Meet right away.

Project Starline inspired machine learning-driven image processing that automatically improves image quality in Google Meet. And the technology works on all types of devices, so you can look your best wherever you are.

Machine learning-driven image processing that automatically improves image quality in Google Meet

We're also bringing studio-quality virtual lighting to Google Meet. You can adjust the light position and brightness, so you're still clearly visible in a dark room or sitting in front of a window. We're testing this feature to ensure everyone's portrait looks true to life, building on the work we've done with Real Tone on Pixel phones and the Monk Skin Tone Scale.

These are just some of the ways in which AI can be used to improve our products: to make them more helpful, more accessible, and to provide everyone with innovative new features.

Today at I/O, Prabhakar Raghavan shares how we're helping people find useful information in a more intuitive way with Google Search

Making knowledge more accessible through computation

We've talked about how we advance the acquisition of knowledge as part of our mission: from better language translation, to improved search experiences across images and videos, to richer exploration of the world using maps.

Now we're focused on making that knowledge even more accessible through computing. The journey we've been on with computing is an exciting one. Every shift, from desktop to the web, to mobile, to wearables and ambient computing, has made knowledge more useful in our daily lives.

As helpful as our devices are, we've had to work pretty hard to adapt to them. I've always thought computers should be adapting to people, not the other way around. We'll continue to push for progress here.

Here's how we're using Google Assistant to make computing more natural and intuitive.

Launched LaMDA 2 and AI Test Kitchen

Demo of LaMDA, the generative language model we developed for dialogue, and AI Test Kitchen

We're continually working to improve our conversational AI capabilities. Conversation and natural language processing are powerful ways to make computers easier for everyone to use, and large language models are the key to making this happen.

Last year, we introduced LaMDA, our generative language model for dialogue applications, which can converse on any topic. Today, we're excited to announce LaMDA 2, the most advanced conversational AI Google has ever built.

The practical application of these models is still in its early days, and it's our responsibility to keep improving them. To make progress, we need people to experience the technology and give feedback. We've opened LaMDA up to thousands of Google colleagues eager to test it and learn about its capabilities, and their feedback has significantly improved the quality of its conversations and reduced inaccurate or offensive responses.

That's why we're building AI Test Kitchen, a new way to explore AI capabilities with a wider audience. It offers a few different experiences, each designed to give you a sense of what it might be like to have LaMDA in your hands and use it in everyday life.

The first demo, "Imagine It", tests whether the model can take an idea you provide and generate imaginative, relevant descriptions. These experiences aren't products; they're a way for us to explore what LaMDA can do together with you. The user interface is very simple.

Say you're writing a story and need some inspiration. Maybe one of your characters is exploring the deep ocean; you can ask LaMDA what that might feel like. Here, LaMDA describes a scene in the Mariana Trench, and it even generates follow-up prompts on the fly. You can ask it to imagine what kinds of creatures might live there. Importantly, we didn't hand-program the model for specific topics like submarines or bioluminescence; it synthesized these concepts from its training data on its own. That's why you can ask about almost anything: Saturn's rings, or even a planet made of ice cream.

Staying on topic is a big challenge for language models. In building these machine learning experiences, we want them to be open-ended enough that people can explore wherever curiosity takes them, while staying focused on the topic at hand. Our second demo shows how LaMDA does this.

In this demo, we've primed the model to stay focused on topics related to dogs. It starts by generating a question to spark conversation: "Have you ever wondered why dogs love to play fetch so much?" If you ask a follow-up question, you get a more nuanced answer: it's fun for the dog, and it ties back to a dog's sense of smell and hunting instincts.

You can follow up on any aspect of that. Maybe you're curious about how a dog's sense of smell works and want to dig deeper; you'll get a unique response to that, too. No matter what you ask, LaMDA will try to keep the conversation on the topic of dogs. If I start asking about cricket, the model will bring the topic back to dogs in a fun way.

Staying on topic is a tough challenge, and it's an important area of research for building useful applications with language models.

These experiences with AI Test Kitchen demonstrate the potential of language models to help us plan, understand the world, and accomplish many other things.

Of course, there are some significant challenges that need to be addressed before these models are truly useful. While we have improved safety, the model may still generate inaccurate, inappropriate, or offensive responses. That's why we actively invite users to provide feedback so they can report issues.

We'll be doing all of this work in accordance with our AI Principles. We'll continue to iterate on LaMDA and open it up gradually over the coming months, carefully and broadly assessing feedback from stakeholders ranging from AI researchers and social scientists to human rights experts. We'll incorporate this feedback into future versions of LaMDA and share our findings as we go.

In the future, we plan to add other emerging AI areas to the AI Test Kitchen. You can learn more at g.com/AITestKitchen.

Making AI language models more powerful

LaMDA 2 has incredible conversational capabilities. To explore other aspects of natural language processing and AI, we also recently announced a new model, the Pathways Language Model (PaLM). It's the largest model we've trained to date, with 540 billion parameters.

PaLM has breakthrough performance on many natural language processing tasks, such as generating code from text, answering math problems, and even explaining a joke.

PaLM achieves this through its scale. And when we combine that scale with a new technique called chain-of-thought prompting, the results are promising. Chain-of-thought prompting lets the model break a multi-step problem into a series of intermediate steps.

Let's take an example of a math word problem that requires reasoning. Normally, you prompt the model with an example question and its answer before asking your real question. In this example, the question is: how many hours are there in the month of May? As you can see, the model gives the wrong answer.

In the "Chain of Thought Prompts", we feed the model a pair of "question-answer" and explain how the answer is derived. It's a bit like your teacher explaining step by step how to solve a problem. Now, if we ask the model "how many hours in May" or other related questions, it can give the correct answer and the solution process.

"Thinking Hint Chain" technology allows models to reason better and give more accurate answers

The "chain of thought cues" greatly improved the accuracy of PaLM, bringing it to the top of the line on several reasoning benchmarks, including math problems. We did all this without changing how the model was trained.

And PaLM can do much more. For example, the web may not have much information in the language you speak. Even more frustrating, the answer you're looking for may be out there somewhere, just not in a language you understand. PaLM offers a new approach that promises to make knowledge more accessible to everyone.

Let me show you an example in which PaLM answers questions in Bengali, a language spoken by 250 million people. All we do is prompt the model with two example questions in Bengali, along with their answers in both Bengali and English.

That's it; now we can start asking questions in Bengali: "What is the national anthem of Bangladesh?" The answer, by the way, is "Amar Sonar Bangla", and PaLM got it right. That's not too surprising, since you'd expect that content to exist in Bengali sources.

You can also try questions that are unlikely to have related information in Bengali, such as: "What are popular pizza toppings in New York City?" The model again answers correctly in Bengali, though exactly how "correct" the answer is will probably spark debate among New Yorkers.

Impressively, PaLM has never seen parallel sentences between Bengali and English, and it was never explicitly taught to answer questions or to translate at all. The model brought its capabilities together on its own to answer questions correctly in Bengali. And we can extend these techniques to more languages and other complex tasks.
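For a sense of how little scaffolding this requires, here is a sketch of the kind of few-shot prompt described above. The example entries are placeholders rather than real Bengali text, and the model call itself is omitted; the point is just that a couple of question-answer exemplars in the prompt are enough to specify the task.

```python
# Sketch of a few-shot prompt for cross-lingual question answering.
# The Bengali text is replaced with placeholders; no model is actually called.

examples = [
    {
        "question_bn": "<Bengali question 1>",
        "answer_bn": "<Bengali answer 1>",
        "answer_en": "<English answer 1>",
    },
    {
        "question_bn": "<Bengali question 2>",
        "answer_bn": "<Bengali answer 2>",
        "answer_en": "<English answer 2>",
    },
]

new_question_bn = "<a new question written in Bengali>"

# Build the prompt: each exemplar pairs a Bengali question with answers in both
# Bengali and English, then the new question is appended for the model to answer.
parts = []
for ex in examples:
    parts.append(
        f"Question (Bengali): {ex['question_bn']}\n"
        f"Answer (Bengali): {ex['answer_bn']}\n"
        f"Answer (English): {ex['answer_en']}\n"
    )
parts.append(f"Question (Bengali): {new_question_bn}\nAnswer (Bengali):")

prompt = "\n".join(parts)
print(prompt)  # this string would be sent to a large language model such as PaLM
```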

We are very optimistic about the potential of language models. One day, we hope we can answer more questions in any language users speak, making knowledge more accessible in Google Search and other Google tools.

Launched the world's largest publicly available machine learning hub

The progress we've shared today is only possible because of our continued innovation in infrastructure. We also recently announced plans to invest $9.5 billion in data centers and offices across the United States.

One of our state-of-the-art data centers is located in Mayes County, Oklahoma, and I'm excited to announce that we're launching the world's largest publicly available machine learning hub there for our Google Cloud customers.

One of our state-of-the-art data centers, in Mayes County, Oklahoma, USA

This machine learning hub has eight Cloud TPU v4 pods, custom-built on the same network infrastructure that powers Google's largest neural models. Together they provide nearly nine exaflops of computing power (on the order of 9 × 10^18 operations per second), giving our customers an unprecedented ability to run complex models and workloads. We hope this will fuel innovation across many fields, from medicine and logistics to sustainability and more.

Speaking of sustainability, this machine learning hub already operates at 90% carbon-free energy. It's helping us make progress toward our goal of running all of our data centers and campuses on 24/7 carbon-free energy by 2030, and we want to be the first major company to do so.

While we invest in data centers, we're also working to innovate on our mobile platforms so that more processing can happen locally on the device. Google Tensor, our custom system on a chip, was an important step in this direction. It already powers the Pixel 6 and Pixel 6 Pro flagship phones, bringing AI capabilities directly to your phone, including the best speech recognition we've ever deployed. It's also a big step toward making devices more secure: combined with Android's Private Compute Core, it can run data-powered features directly on the device, keeping your personal information private.

People turn to our products for help every day, in moments big and small. Core to making this possible is protecting your private information at every step. Even as technology grows increasingly complex, our products are secure by default, private by design, and keep you in control, which is how we keep more people safe online around the world than anyone else.

Today we also shared updates to platforms like Android, which is delivering access, connectivity, and information to billions of people through their smartphones and other connected devices like TVs, cars, and watches.

We also shared our latest Pixel lineup, including the Pixel 6a, Pixel Buds Pro, Google Pixel Watch, Pixel 7, and Pixel tablet, all designed with ambient computing in mind. We're excited about how this family of devices works together to better assist users.

A new frontier of computing: augmented reality

Today we've talked about the technologies that are changing how we use computers and how we access knowledge: devices that work seamlessly together wherever we need them, and conversational interfaces that make it easier to get things done.

Looking ahead, there's a new frontier of computing with the potential to build on all of this, and that is augmented reality (AR). Google is investing heavily in AR: we've already built AR into many products, including Google Lens, multisearch, scene exploration, and Live View and immersive view in Google Maps.

These AR capabilities are already useful on phones, and the real magic will come when they can deliver realistic, natural experiences in the real world, as if the technology weren't there at all.

What excites us most about AR is its potential to keep us focused on the real world, on real life. After all, the world we live in is pretty wonderful!

It's critical that we design and build for the real world, without pulling people away from it. AR offers a new way to realize exactly that design philosophy.

Take language, for example. Language is fundamental to how people communicate with one another. But communication gets hard when the other person speaks a different language, or when one side of the conversation is hard of hearing. Let's see what happens when we apply our latest advances in translation and transcription and deliver them in one of our early test prototypes.

In the video, you can see people communicating with others naturally and smoothly, with joy on their faces. Understanding and being understood: that moment of connection is what our work on knowledge and computing is all about, it's what we help people achieve through our products, and it's what we strive for every day.

Every year, we move a little closer to delivering on our timeless mission, and we still have a long way to go. At Google, we genuinely feel that excitement, and we're optimistic that the breakthroughs you've seen today will help us get there. Thank you to all of the developers, partners, and customers who joined us. We look forward to building the future with all of you.

