fast processing: changing the column value in a geospatial environment

I have a table whose header looks like this: complaint_type, borough, street_name, incident_zip, latitude, longitude

1) I want to check whether the "incident_zip" value of each row is in a specific list of zip codes and change the "borough" accordingly, but only in rows where the borough is "unspecified". There are five long lists of zip codes and a large amount of data, and I cannot find an efficient way to do this. I am using Python 3.6. I used if statements along with replace, but the processing takes so long (more than 3 hours) that I have to stop the kernel.

2) Is there any other way to update the "borough" column, for example by using latitude and longitude?

Platinum

Posted 2019-11-10T12:28:30.497

Reputation: 11

So you have zip code to borough mapping? – Yohanes Alfredo – 2019-11-10T12:39:00.970

Answers

This checks whether each incident_zip is in the list of zip codes:

df.isin({'incident_zip': zip_code_list})
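As a minimal sketch of how such a membership check could drive the update, assuming the placeholder value is the string "Unspecified" and using a small made-up table and an illustrative zip list (`manhattan_zips` is a hypothetical name, not from the question):

```python
import pandas as pd

# Hypothetical sample data; the real table has more columns and many more rows.
df = pd.DataFrame({
    "incident_zip": ["10001", "11201", "10451", "99999"],
    "borough": ["Unspecified", "Unspecified", "BRONX", "Unspecified"],
})

# Illustrative zip list for one borough (assumed values).
manhattan_zips = ["10001", "10002"]

# Boolean mask: zip is in the list AND the borough is still unspecified.
mask = df["incident_zip"].isin(manhattan_zips) & (df["borough"] == "Unspecified")

# Vectorized assignment; no Python-level loop over rows.
df.loc[mask, "borough"] = "MANHATTAN"
```

Repeating the last two statements once per borough list would cover all five lists with five vectorized passes instead of a per-row if chain.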

If you have a zip code to borough mapping in the form of a dict, you can look up the borough for every row from its zip code:

df['incident_zip'].map(zip_to_borough)
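A possible way to put this together, sketched under the assumptions that the five lists can be collected into one dict keyed by borough name and that the placeholder is the string "Unspecified" (`borough_zips` and the sample values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data standing in for the real table.
df = pd.DataFrame({
    "incident_zip": ["10001", "11201", "10451", "99999"],
    "borough": ["Unspecified", "Unspecified", "BRONX", "Unspecified"],
})

# Assumed shape: one zip list per borough (only two shown here).
borough_zips = {
    "MANHATTAN": ["10001", "10002"],
    "BROOKLYN": ["11201", "11215"],
}

# Invert the lists into a flat zip -> borough dict.
zip_to_borough = {z: b for b, zips in borough_zips.items() for z in zips}

# Look up every zip at once; zips missing from the dict become NaN.
mapped = df["incident_zip"].map(zip_to_borough)

# Overwrite only rows that are unspecified and have a known zip.
unspecified = df["borough"] == "Unspecified"
df["borough"] = df["borough"].mask(unspecified & mapped.notna(), mapped)
```

Rows whose zip is not in any list keep their original value, so they can be inspected separately afterwards.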

With a pandas DataFrame, never iterate row by row except as a last resort, because the performance will be terrible.

You can speed up these operations further with multiprocessing if you want.

Yohanes Alfredo

Posted 2019-11-10T12:28:30.497

Reputation: 928