The Google Research team has announced that they will soon be adding a feature called Ask Photos, powered by Gemini AI, to Google Photos. They also explained how it will function.
According to the team, Ask Photos is a strong example of how Gemini models can work as intelligent agents using their memory capabilities.
Sample queries Google has provided outside the on-stage announcement include ‘Show me the best photo from each national park I’ve visited’ and ‘What themes have we had for Lena’s birthday parties?’. According to the team, a conversational query is “passed to an agent model that uses Gemini to determine the best retrieval augmented generation (RAG) tool for the task.”
Typically, the agent model begins by working out what the user is looking for and then searches for photos using an improved vector-based retrieval system. This system extends the already powerful metadata search in Google Photos and is much better at understanding natural language concepts than plain keyword matching.
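To make the idea of vector-based retrieval layered on metadata search more concrete, here is a minimal sketch. It is not Google's implementation: the `Photo` class, the `search` function, and the three-dimensional embeddings are all hypothetical, and a real system would get much larger vectors from a trained model. The sketch only shows the general technique of filtering on an exact metadata field and then ranking by cosine similarity to the query.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Photo:
    # Hypothetical record; not a Google Photos data structure.
    name: str
    year: int                # metadata field, usable for exact filtering
    embedding: list[float]   # toy 3-d vector standing in for a learned embedding


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(photos: list[Photo], query_embedding: list[float],
           year: int | None = None, top_k: int = 2) -> list[Photo]:
    """Filter by metadata first, then rank the survivors by similarity."""
    candidates = [p for p in photos if year is None or p.year == year]
    ranked = sorted(candidates,
                    key=lambda p: cosine(p.embedding, query_embedding),
                    reverse=True)
    return ranked[:top_k]


# Made-up library; the vectors are chosen by hand for illustration only.
PHOTOS = [
    Photo("beach_sunset.jpg", 2022, [0.9, 0.1, 0.0]),
    Photo("birthday_cake.jpg", 2023, [0.1, 0.9, 0.1]),
    Photo("mountain_hike.jpg", 2023, [0.2, 0.1, 0.9]),
    Photo("party_balloons.jpg", 2023, [0.0, 0.7, 0.5]),
]

# Pretend this vector is the embedding of a query about birthday parties.
results = search(PHOTOS, [0.0, 1.0, 0.2], year=2023)
print([p.name for p in results])
```

Because ranking uses the distance between vectors rather than shared words, a query can match a photo whose file name or caption contains none of the query's terms, which is the advantage over keyword search described above.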
Because Gemini can examine both photos and videos, it can draw on context that spans long periods and different types of media. This helps it find the most relevant information for a query, whether that is visual content, text, dates, or locations.
According to the team, the answer model then composes a helpful reply based on the photos and videos it has examined. The upcoming Ask Photos will also remember this information for future chats. The team presents it as more than a search tool: an assistant that can help in many ways while keeping search easy to use.
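The overall flow described in this article (an agent picks a retrieval tool, the tool returns matches, an answer step writes the reply, and the exchange is remembered) can be sketched as follows. Everything here is a hypothetical stand-in: `choose_tool` uses a trivial rule where the real system uses a Gemini model, the two search functions return placeholder strings, and `AskSession` is an invented name rather than any Google API.

```python
from __future__ import annotations

from typing import Callable


def metadata_search(query: str) -> list[str]:
    # Placeholder for an exact lookup on dates, locations, and similar fields.
    return [f"metadata hit for: {query}"]


def vector_search(query: str) -> list[str]:
    # Placeholder for a semantic, embedding-based lookup.
    return [f"semantic hit for: {query}"]


TOOLS: dict[str, Callable[[str], list[str]]] = {
    "metadata": metadata_search,
    "vector": vector_search,
}


def choose_tool(query: str) -> str:
    """Toy stand-in for the agent model: a four-digit number is treated as
    a year, so the query is routed to metadata search."""
    has_year = any(tok.isdigit() and len(tok) == 4 for tok in query.split())
    return "metadata" if has_year else "vector"


class AskSession:
    """Keeps a running memory of past questions and answers."""

    def __init__(self) -> None:
        self.memory: list[tuple[str, str]] = []

    def ask(self, query: str) -> str:
        tool = choose_tool(query)              # agent step: pick a tool
        hits = TOOLS[tool](query)              # retrieval step
        answer = f"[{tool}] {hits[0]}"         # answer step (trivial here)
        self.memory.append((query, answer))    # remembered for later chats
        return answer


session = AskSession()
print(session.ask("photos from 2023"))
print(session.ask("best birthday party themes"))
print(len(session.memory))
```

Keeping the tool choice separate from the retrieval functions means a new tool can be added to `TOOLS` without changing the session logic, which is one plausible reason to structure an agent this way.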