This screencast demonstrates how to use Google AI Studio and Gemini 1.5 Pro to extract structured book data from bookshelf photos. The model reads the titles and authors printed on the book spines, which is in effect OCR (Optical Character Recognition), and returns them as structured JSON that you can then convert to CSV for use in a spreadsheet. The same approach carries over to other photos of labeled items, such as product shelves, package labels, and prescription labels.
I recorded a screencast demonstrating how to extract structured book data from images using Gemini 1.5 Pro.
The following images by Brett Jordan on Unsplash were used in the experiment:
I uploaded a photo of a bookshelf to Google AI Studio.
I used the following prompt to guide Gemini 1.5 Pro in extracting book details:
You are an expert data extractor and librarian. Your task is to analyze images of bookshelves and extract information about the books.
Example Output Format:
[
  {
    "title": "The Lord of the Rings",
    "author": "J.R.R. Tolkien",
    "subject": "Fantasy"
  },
  {
    "title": "Pride and Prejudice",
    "author": "Jane Austen",
    "subject": "Classic Literature"
  },
  {
    "title": "A Brief History of Time",
    "author": "Stephen Hawking",
    "subject": "Physics"
  },
  {
    "title": "Some Book with a title that does not give away the subject",
    "author": "Some Author",
    "subject": "Unknown"
  }
]
Please analyze the provided image and generate the JSON output.
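The prompt above has three parts: a role and task, an example of the JSON shape, and a closing request. Here is a minimal sketch of how that structure could be assembled in Python so the example records can be swapped out for a different kind of item. The function name `build_extraction_prompt` and its parameters are hypothetical, made up for illustration; I typed the prompt by hand in the screencast.

```python
import json


def build_extraction_prompt(role, task, examples):
    """Assemble a prompt: role and task, an example JSON array, then the request."""
    example_json = json.dumps(examples, indent=2)
    return (
        f"You are {role}. Your task is to {task}.\n"
        "Example Output Format:\n"
        f"{example_json}\n"
        "Please analyze the provided image and generate the JSON output."
    )


# Example records define the keys the model should return.
book_examples = [
    {"title": "The Lord of the Rings", "author": "J.R.R. Tolkien", "subject": "Fantasy"},
    {
        "title": "Some Book with a title that does not give away the subject",
        "author": "Some Author",
        "subject": "Unknown",
    },
]

prompt = build_extraction_prompt(
    role="an expert data extractor and librarian",
    task="analyze images of bookshelves and extract information about the books",
    examples=book_examples,
)
print(prompt)
```

Keeping an "Unknown" example in the list shows the model what to return when it cannot infer a value, instead of leaving it to guess.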
Gemini 1.5 Pro processed the image and returned structured JSON data:
[
  {
    "title": "Meat",
    "author": "Pat LaFrieda",
    "subject": "Cooking"
  },
  {
    "title": "Simply Beautiful Photographs",
    "author": "Griffiths",
    "subject": "Photography"
  },
  {
    "title": "Simple",
    "author": "Diana Henry",
    "subject": "Cooking"
  },
  {
    "title": "Salt is Essential",
    "author": "Shaun Hill",
    "subject": "Cooking"
  },
  {
    "title": "gather",
    "author": "Gill Meller",
    "subject": "Cooking"
  },
  {
    "title": "What to Cook & How to Cook it",
    "author": "Jane Hornby",
    "subject": "Cooking"
  },
  {
    "title": "Rick Stein's India",
    "author": "Rick Stein",
    "subject": "Cooking, Travel"
  },
  {
    "title": "The Great British Bake Off",
    "author": "Linda Collister",
    "subject": "Cooking"
  },
  {
    "title": "Perfect Cakes & Bakes to Make at Home",
    "author": "Linda Collister",
    "subject": "Cooking"
  },
  {
    "title": "Recharge Your Design Batteries",
    "author": "John Norris",
    "subject": "Design"
  },
  {
    "title": "The Complete Guide to Stretching",
    "author": "Christopher M. Norris",
    "subject": "Health, Fitness"
  },
  {
    "title": "I'm Not Your Friend, I'm Free",
    "author": "Sam McBratney, Kim Lewis",
    "subject": "Unknown"
  },
  {
    "title": "When from Death",
    "author": "John Rawson",
    "subject": "Unknown"
  },
  {
    "title": "An Illustrated Life",
    "author": "Danny Gregory",
    "subject": "Art, Inspiration"
  }
]
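Before saving a reply like this to books.json, it can help to check that it really is a JSON array with the three expected keys. The sketch below is one way to do that with the standard library. It assumes the reply might arrive wrapped in a Markdown code fence, which is my assumption and not something shown above; `parse_books` and `REQUIRED_KEYS` are hypothetical names.

```python
import json

REQUIRED_KEYS = {"title", "author", "subject"}
FENCE = "`" * 3  # Markdown code fence marker, built this way to keep the snippet readable


def parse_books(reply_text):
    """Parse a model reply into a list of book dicts, tolerating a Markdown fence."""
    text = reply_text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag) ...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ... and everything from the closing fence onward, if one is present.
        text = text.rsplit(FENCE, 1)[0]
    books = json.loads(text)
    if not isinstance(books, list):
        raise ValueError("expected a JSON array of books")
    for i, book in enumerate(books):
        if not isinstance(book, dict):
            raise ValueError(f"book {i} is not a JSON object")
        missing = REQUIRED_KEYS - set(book)
        if missing:
            raise ValueError(f"book {i} is missing keys: {sorted(missing)}")
    return books


# Usage: a one-record reply wrapped in a fence.
reply = (
    FENCE + "json\n"
    + '[{"title": "Meat", "author": "Pat LaFrieda", "subject": "Cooking"}]\n'
    + FENCE
)
books = parse_books(reply)
print(books[0]["title"])
```

This only checks the shape of the data. It cannot tell you whether a title or author was misread from the photo, so a quick visual comparison against the image is still worthwhile.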
Step 4: Convert JSON to CSV
Once I had the JSON data, I converted it into a CSV file for use in spreadsheets. Here’s the Python script I used:
import pandas as pd
# Load the JSON file into a DataFrame
df = pd.read_json('books.json')
# Save as CSV
df.to_csv('books.csv', index=False)
print("books.json has been successfully converted to books.csv")
Now, I could open books.csv in Google Sheets or Excel for further analysis.
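If pandas isn't installed, the same conversion can be sketched with Python's standard `json` and `csv` modules. This is an alternative to the script above, not what I ran in the screencast, and `books_to_csv` is a hypothetical helper name. It also shows what happens to values that contain commas, such as the "Cooking, Travel" subject.

```python
import csv
import io
import json


def books_to_csv(books, fieldnames=("title", "author", "subject")):
    """Return CSV text with a header row and one row per book."""
    buffer = io.StringIO()
    # lineterminator="\n" overrides the csv module's default of "\r\n".
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames), lineterminator="\n")
    writer.writeheader()
    writer.writerows(books)
    return buffer.getvalue()


books = json.loads(
    '[{"title": "Rick Stein\'s India", "author": "Rick Stein", "subject": "Cooking, Travel"}]'
)
csv_text = books_to_csv(books)
print(csv_text)
```

The csv module wraps "Cooking, Travel" in double quotes, so a spreadsheet reads it as one cell and not two columns.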
Applying This Process to Other Real-World Use Cases
This method of OCR and structured data extraction can be extended beyond books. Here are some real-life applications:
Inventory Management
Retailers and warehouses can take shelf photos to automatically extract product names, brands, and barcodes, reducing manual data entry.
Logistics & Package Tracking
Shipping companies can scan package labels and extract tracking numbers, recipient names, and addresses into a structured format.
Healthcare & Prescription Management
Pharmacists can digitize prescription labels, extracting patient names, medications, and dosage instructions for better record-keeping.
Conclusion
Using Google AI Studio and Gemini 1.5 Pro, you can:
Extract structured data from images using OCR.
Convert JSON to CSV for easier spreadsheet use.
Apply this method to real-world applications in retail, logistics, and healthcare.
This is just one example of how AI can automate tedious tasks and improve efficiency in data processing. AI is changing every day…
Try extracting structured data using Google AI Studio.
Convert the JSON output into CSV for analysis.
Think about how this method can be applied to other industries.🧠
Let me know if you try it, or if you figure something else out!! ⭐