Media Metadata Consultant - AI Vision Tooling & Data Extraction

Budget: $40.0 - $80.0 HOURLY / NOT_SURE ⭐ 4.40 (1) United States

python, json, product-research

Summary

We are a US-European team developing an early-stage iOS product focused on helping people retrieve, organize, review, and reflect on travel media albums. More information about the product is in the About The Project section below.

We are looking for a technical product research consultant with experience in photo metadata, AI image analysis, geolocation-related data, lightweight pipeline prototyping, and database structure definition based on project requirements.

This is a short discovery and architecture review project. We are not looking for a full production build at this stage. The initial engagement is planned to be up to 40 hours, with possible follow-up work if the collaboration is successful. We expect a weekly 1-hour meeting and a brief written progress report.
________________________________________
The Work To Be Done

We need a consultant to help us evaluate practical approaches for working with phone-based media data and related metadata that can be either obtained or inferred. The work will include researching, testing, and recommending frameworks, APIs, tools, and pipelines in order to:

1. Evaluate AI vision tools for image labeling, scene recognition, landmark detection, OCR, and other semantic tagging that can be integrated into the data extraction pipeline. The suggested tools must be explained and compared, and their system and scalability implications must be defined.
2. Extract and infer metadata from Google Photos / Google Takeout, Apple Photos, and iPhone media exports. Extracted data is useful if it adds relevant, rich context to a media element, both in isolation and in relation to the user's other media elements.
3. Recommend methods to group travel photos into reasonable media sets or “moments” using time, location, image content, metadata, inferred data, relational data, and manual user input.
4. Recommend database structure approaches for evolvable, context-rich storage of media and media groups.
5. Evaluate and describe the options for, and value of, working with EXIF, XMP, JSON, CSV, and related data structures through an Expo / React Native codebase.
6. Help our team understand which media metadata is reliable, which is inconsistent, and which might be missing, along with the reason for each and its implications for context enrichment.
7. Recommend a practical prototype path using existing tools, APIs, frameworks, or lightweight custom pipelines.
8. Identify and discuss privacy implications, licensing, financial costs, and technical constraints as the size and number of processed media elements scale with more users and heavier usage per user.
9. Evaluate geolocation and reverse-geocoding approaches for converting coordinates into useful place information that enriches the context of a media element, both on its own and in relation to others.

*Note: the list is ordered by suggested priority. If time constraints do not allow all items to be covered, priorities will be revised.

We want someone who understands the existing tool landscape, can help us choose a practical path for integrating data extraction technologies into a workflow that enriches contextual information about media groups, and can advise which data handling and storage approaches best fit the project requirements.
________________________________________
Expected Output

At the end of the project, we expect:
• A concise discovery report
• A comparison table of relevant tools, APIs, frameworks, and workflows
• A recommended technical approach for a next-stage prototype of the media data workflow for extracting data and using it in the app
• A minimal working demonstration using a small sample dataset
• Notes on privacy, data handling, licensing, cost, and implementation risks

The working demonstration does not need to be production code. It may include a script, spreadsheet, notebook, API test, or structured example showing how the recommended workflow could operate.
________________________________________
Ideal Candidate

You may be a good fit if you have experience with several of the following:
• Media metadata extraction
• EXIF / XMP / IPTC metadata
• Google Takeout or Google Photos metadata
• Apple Photos or iPhone photo exports
• ExifTool or similar metadata tools
• AI vision APIs or image labeling systems
• Object detection, landmark detection, OCR, or semantic image tagging
• Geolocation, reverse geocoding, map APIs, or place data
• Python, CSV, JSON, API testing, or lightweight data pipelines
• Technical product research or MVP architecture review
• Privacy-sensitive handling of personal media and data
• Data-driven React Native and Expo apps
• SQLite and Supabase, local and cloud databases
• Familiarity with vector databases or AI-queryable context stores (e.g. pgvector)
• Migration from SQL-only databases to vector databases

Broad personal travel experience, photography/vlogging experience, or experience with travel/photo apps is preferred. We want someone who can think about both the technical workflow and what would actually be useful to a traveler after a trip. English proficiency is a requirement.
________________________________________
About The Project

We are building a mobile travel journal app (React Native / Expo, iOS-first, currently in TestFlight beta) that helps travelers organize their trip photos into a meaningful, easy-to-revisit record. Instead of leaving hundreds of photos scattered and unsorted, the app uses available photo metadata (timestamps, location, sequence) to group images into trips and distinct moments, then lets users add short context (a caption or a title) to enrich those moments. The output is a polished, shareable graphic artifact generated from the trip.

We've built a first version of the photo grouping logic already, using basic metadata signals to cluster images into likely moments. We're now looking to go deeper: smarter use of EXIF/location data, better handling of missing or inconsistent metadata, and an evaluation of where AI vision tools (scene/landmark recognition, OCR, etc.) could meaningfully improve grouping quality without adding unnecessary cost or complexity.

The current architecture is local-first, on-device storage, with Supabase being introduced as a lightweight analytics/identity layer (not yet the primary data store). Any recommendations need to account for that constraint rather than assuming that a full cloud migration is already in place.
________________________________________
Privacy / NDA

An NDA will be required before we share private project materials or sample data. The contractor may not upload any personal or user-supplied images to third-party AI tools, model-training systems, public datasets, or cloud services without written approval. The contractor may not distribute private project materials, information, or details in any way without prior authorization.
________________________________________
Engagement Details

• Project type: short discovery sprint
• Contract type: hourly
• Rate range: $40-$80 / hour
• Estimated time: up to 40 hours over 4 weeks
• Possible continuation depending on results and fit
________________________________________
To Apply

Please include:
1. Experience, Tools & Proposed Approach (1-2 pages max): Outline your practical and technical expertise, with projects and experience relevant to the scope described above. Describe how you would structure this 40-hour sprint to accomplish the points defined above.
2. Your hourly rate and availability (start date, hours per week, time zone, etc.).
3. Any privacy or data-handling concerns you would want us to consider.
