Vector Database Engineer with Python and XML data experience - Upwork
Need a Python expert familiar with large XML and DTD files, especially bulk data downloaded from the US Patent Office website.
The Python expert needs to parse the large XML files and vectorize their contents with standard techniques, using NLTK or similar. The expert then needs to upload the vectorized data to Pinecone for further searching.
The data comprises 20 years of patent data.
Please use parsers others have developed specifically for USPTO data:
https://github.com/lettergram/parse-uspto-xml/tree/master/config
https://github.com/TamerKhraisha/uspto-patent-data-parser/blob/master/uspto.py
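For orientation, the parsing step could look roughly like the sketch below. It assumes the bulk file is a concatenation of many standalone XML documents, each starting with its own <?xml ...?> declaration, and it uses only the Python standard library rather than the parsers linked above. The element names (publication-reference, doc-number, invention-title, abstract) are assumptions about the grant schema and should be checked against the DTD for each year; the function names and the SAMPLE data are made up for illustration.

```python
import xml.etree.ElementTree as ET


def iter_documents(lines):
    """Yield one XML document at a time from a concatenated bulk file.

    `lines` is any iterable of text lines (for example an open file), so the
    whole bulk file never has to be held in memory at once.
    """
    buffer = []
    for line in lines:
        # A new XML declaration marks the start of the next document.
        if line.lstrip().startswith("<?xml") and buffer:
            doc = "".join(buffer).strip()
            if doc:
                yield doc
            buffer = []
        buffer.append(line)
    doc = "".join(buffer).strip()
    if doc:
        yield doc


def extract_record(xml_doc):
    """Pull a few text fields out of one patent document.

    The element paths below are assumptions, not taken from the job posting.
    """
    root = ET.fromstring(xml_doc.encode("utf-8"))

    def text_of(path):
        node = root.find(path)
        return "".join(node.itertext()).strip() if node is not None else ""

    return {
        "doc_number": text_of(".//publication-reference//doc-number"),
        "title": text_of(".//invention-title"),
        "abstract": text_of(".//abstract"),
    }


# Made-up sample standing in for a real bulk file.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<us-patent-grant>
<us-bibliographic-data-grant>
<publication-reference><document-id><doc-number>0000001</doc-number></document-id></publication-reference>
<invention-title>Example widget</invention-title>
</us-bibliographic-data-grant>
<abstract><p>A widget that does something.</p></abstract>
</us-patent-grant>
<?xml version="1.0" encoding="UTF-8"?>
<us-patent-grant>
<us-bibliographic-data-grant>
<publication-reference><document-id><doc-number>0000002</doc-number></document-id></publication-reference>
<invention-title>Example gadget</invention-title>
</us-bibliographic-data-grant>
<abstract><p>A gadget that does something else.</p></abstract>
</us-patent-grant>
"""

records = [extract_record(doc) for doc in iter_documents(SAMPLE.splitlines(keepends=True))]
```

Real bulk files also carry DOCTYPE lines and entity references, and the schema has changed over the 20-year range, which is a good reason to lean on the existing USPTO parsers instead of hand-rolled paths like these.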
Once the vectorized data has been uploaded to Pinecone, the job may continue with "fine-tuning" ChatGPT models.
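The vectorize-and-upload step described above might be sketched as follows. This is only an illustration: it swaps in plain feature hashing of tokens (standard library only) as a stand-in for whichever NLTK-based or embedding-based technique is actually chosen, and it stops short of calling Pinecone. The (id, values, metadata) tuple shape is an assumption about what the Pinecone Python client accepts for upserts and should be checked against the current client documentation. The dimension, batch size, and function names are hypothetical.

```python
import hashlib
import math
import re

DIM = 256  # hypothetical vector dimension; must match the Pinecone index


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a stand-in for NLTK)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def hash_vectorize(text, dim=DIM):
    """Map text to a fixed-length, L2-normalized vector via feature hashing."""
    vec = [0.0] * dim
    for token in tokenize(text):
        # A cryptographic digest gives a hash that is stable across runs,
        # unlike the built-in hash(), which is salted per process.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def to_upsert_batches(records, batch_size=100):
    """Turn parsed records into batches of (id, values, metadata) tuples."""
    vectors = [
        (
            rec["doc_number"],
            hash_vectorize(rec["title"] + " " + rec["abstract"]),
            {"title": rec["title"]},
        )
        for rec in records
    ]
    return [vectors[i:i + batch_size] for i in range(0, len(vectors), batch_size)]


# Each batch would then be handed to the vector database client, for example
# one upsert call per batch (client setup omitted; requires an API key).
```

Uploading in batches keeps each request small, which matters for a dataset spanning 20 years of patents.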
Hourly Range: $35.00-$70.00
Posted On: April 26, 2023 04:27 UTC
Category: Database Development
Skills: Data Migration, Redis, Python, XML, NLTK
Country: United States