Task: describe the data structure of a web source (mainly JSON files) and look for specific data.
Country: US.
Industry: health care insurance plans.
As of July 1, 2022, most group health plans and issuers of group or individual health insurance coverage are required to disclose, on a public website, machine-readable files containing in-network rates for covered items and services, and allowed amounts and historical billed charges for out-of-network providers.
There are many insurance companies but we will start with one: Centene.
We need the records where CPT codes are either 92227, 92228, 92250 or 92229.
CPT codes are codes that describe medical procedures.
They probably appear under headers (keys) that contain "cpt", "billing code", etc.
Codes can be numeric or alphanumeric.
We need a count of the rows where any of the four codes appears.
We are after in-network data only.
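A minimal sketch of the counting step, assuming each in-network file follows the CMS Transparency in Coverage layout: a top-level "in_network" list whose items carry "billing_code_type" and "billing_code". The field names, the choice to treat one "in_network" item as one row, and the function name count_target_rows are assumptions for illustration; the actual Centene files should be inspected before relying on them.

```python
# Target CPT codes from the task description.
TARGET_CODES = {"92227", "92228", "92229", "92250"}


def count_target_rows(mrf: dict) -> int:
    """Count in-network items whose billing code is one of the target CPT codes.

    Assumption: one item of the "in_network" list is one "row".
    """
    count = 0
    for item in mrf.get("in_network", []):
        # Codes may be stored as numbers or strings, so normalise to text.
        code = str(item.get("billing_code", "")).strip()
        code_type = str(item.get("billing_code_type", "")).strip().upper()
        if code_type == "CPT" and code in TARGET_CODES:
            count += 1
    return count


# Tiny hand-made sample (not real Centene data).
sample = {
    "in_network": [
        {"billing_code_type": "CPT", "billing_code": "92250", "negotiated_rates": []},
        {"billing_code_type": "CPT", "billing_code": "99213", "negotiated_rates": []},
        {"billing_code_type": "CPT", "billing_code": 92229, "negotiated_rates": []},
        {"billing_code_type": "HCPCS", "billing_code": "G0008", "negotiated_rates": []},
    ]
}
print(count_target_rows(sample))
```

If a "row" should instead mean one negotiated rate, the same loop could sum the lengths of the nested rate lists for matching items.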
Data is stored at URLs that point to JSON files or compressed JSON files.
If a list of JSONs includes duplicates, provide data only for the unique JSONs.
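The two points above could be handled roughly as follows. This is a sketch under assumptions: "duplicate" is taken to mean an identical URL string, compressed files are assumed to be gzip (detected by the gzip magic bytes), and the helper names unique_urls and parse_json_bytes are hypothetical. Downloading is left out so the example runs offline.

```python
import gzip
import json


def unique_urls(urls):
    """Drop repeated URLs while keeping the order of first appearance."""
    return list(dict.fromkeys(urls))


def parse_json_bytes(raw: bytes):
    """Parse a downloaded payload that is either plain JSON or gzip-compressed JSON."""
    # gzip streams start with the two magic bytes 0x1f 0x8b.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


# Offline demonstration with made-up URLs and a made-up payload.
urls = [
    "https://example.com/a.json.gz",
    "https://example.com/b.json",
    "https://example.com/a.json.gz",
]
print(unique_urls(urls))

payload = {"in_network": []}
plain = json.dumps(payload).encode("utf-8")
compressed = gzip.compress(plain)
print(parse_json_bytes(plain) == parse_json_bytes(compressed))
```

Real in-network files can be very large, so loading a whole file into memory as done here may not be practical; a streaming parser may be needed instead.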
Centene's machine-readable files (MRFs) are here: https://www.centene.com/price-transparency-files.html.
We will start with the 19 items under "Links to the download TOC files". Ignore what's under "Centene Employees".
Note that there is a hierarchical structure. It should be expressed in the final product.
Report the results in a JSON file.
Remember to enter the data while keeping the original structure (hierarchy) of the links or URLs.
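One possible way to keep the hierarchy in the report is to mirror the index (TOC) file and attach a count to each in-network file. The sketch below assumes the TOC follows the CMS Transparency in Coverage layout ("reporting_structure" entries with "reporting_plans" and "in_network_files", each file having "description" and "location"). The output key "row_count" and the function build_report are hypothetical, and the counting function is passed in so the example stays self-contained.

```python
import json
from typing import Callable


def build_report(toc: dict, count_for_url: Callable[[str], int]) -> dict:
    """Mirror the TOC hierarchy, keeping only unique in-network files with a count each."""
    report = {
        "reporting_entity_name": toc.get("reporting_entity_name"),
        "reporting_structure": [],
    }
    for structure in toc.get("reporting_structure", []):
        seen = set()
        files = []
        for entry in structure.get("in_network_files", []):
            url = entry.get("location")
            if not url or url in seen:
                continue  # skip missing or duplicate URLs within this structure
            seen.add(url)
            files.append({
                "description": entry.get("description"),
                "location": url,
                "row_count": count_for_url(url),
            })
        report["reporting_structure"].append({
            "reporting_plans": structure.get("reporting_plans", []),
            "in_network_files": files,
        })
    return report


# Made-up TOC and a stub counter standing in for the real download-and-count step.
toc = {
    "reporting_entity_name": "Example Plan",
    "reporting_structure": [{
        "reporting_plans": [{"plan_name": "Plan A"}],
        "in_network_files": [
            {"description": "rates", "location": "https://example.com/a.json"},
            {"description": "rates", "location": "https://example.com/a.json"},
        ],
        "allowed_amount_file": {"description": "oon", "location": "https://example.com/oon.json"},
    }],
}
report = build_report(toc, lambda url: 3)
print(json.dumps(report, indent=2))
```

Out-of-network entries such as "allowed_amount_file" are deliberately not copied, since only in-network data is requested.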
I have created a JSON which contains a partial mapping of "Index File for Ambetter Plans" to make things much clearer.
Where text data still needs to be extracted by you, I entered "PLEASE ADD".
Where numeric data still needs to be extracted by you, I entered "99999".
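To see which parts of the partial mapping still need filling in, the placeholders can be located by walking the JSON recursively. This is a sketch: the sample mapping below is invented (the real file's keys are not known here), the helper name find_placeholders is hypothetical, and 99999 is matched both as a number and as a string because the brief does not say which form is used.

```python
def find_placeholders(node, path=()):
    """Yield the path (tuple of keys and list indexes) to every placeholder value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_placeholders(value, path + (index,))
    elif node == "PLEASE ADD" or node == 99999 or node == "99999":
        yield path


# Invented example of a partially filled mapping.
mapping = {
    "name": "Index File for Ambetter Plans",
    "files": [
        {"description": "PLEASE ADD", "row_count": 99999},
        {"description": "known file", "row_count": 12},
    ],
}
for placeholder_path in find_placeholders(mapping):
    print(placeholder_path)
```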
Use Python to perform the task and provide us with the code.

Hourly Range: $17.00-$40.00
Posted On: March 16, 2023 08:24 UTC
Category: Data Extraction
Skills: Python, JSON, Data Scraping, pandas
Country: Israel