Soil data standardization and harmonization

We use LimeSoDa, a collection of open-source farm-level soil datasets, as an example to demonstrate how our soil data harmonization workflow operates, including both the standardization and harmonization steps.

In this example, we work with two LimeSoDa subsets:

  • NSW.52
  • SC.50

Together, these cover three soil properties:

  • Soil Organic Carbon (soc)
  • pH measured in water (ph.h2o)
  • Clay content (mass percentage of particles < 2 µm)

Standardization

This step collects, standardizes, and unifies soil datasets from multiple formats (JSON, SHP, Excel, CSV, DBF).
For each data source, a notebook named {dataset_id}_standardize.ipynb performs the full standardization pipeline.
Metadata required for the later harmonization step is manually added in {dataset_id}_{subset_id}_meta.json.


Extracting Metadata into a Standard JSON File

Soil property metadata often appears in inconsistent formats and may contain varying levels of detail.
To standardize and automate the workflow, we define a set of metadata fields of interest and manually extract them into a JSON file. This metadata is required for the downstream harmonization steps and includes the following fields:

Field                          Description
dataset_id                     Project or data source name (e.g., LUCAS).
subset_id                      Subset of the dataset (e.g., LUCAS 2009).
site_id                        Identifier for individual measurements within the dataset.
country                        Country where the data was collected.
lat / lon                      Sample location coordinates (WGS84).
time                           Sampling time.
hzn_dep / hzn_top & hzn_btm    Reported soil horizon depth (single value or top/bottom).
lc_survey                      Land cover information collected along with soil samples.

Apart from the general fields above, there is also metadata for soil property measurements (specifically, units and measurement methods), which can appear in two forms:

1. Uniform Property Metadata (most common): The entire dataset subset shares the same units and the same measurement method (e.g., LUCAS, LimeSoDa). In this case, unit and method information is recorded once in the JSON metadata file, and the processing script can automatically apply this shared information. For SOC, an additional metadata field, som?, indicates whether SOC is reported as SOM; if so, a simple scaling factor is applied to convert SOM to SOC (see the sketch after this list).

2. Row-Specific Property Metadata: For some datasets, each row has its own unit, method, or measurement specification. In this case the JSON fields for unit/method are set to null, and additional manual processing is required during standardization to extract the row-level metadata.
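As background, SOM is conventionally converted to SOC using the Van Bemmelen factor (SOC ≈ 0.58 × SOM). The sketch below illustrates how the som? flag might drive such a conversion; the function name and the exact factor are illustrative assumptions, not taken from the workflow's helpers.

Code
# Hypothetical sketch: apply the SOM -> SOC scaling when the meta flag is set.
VAN_BEMMELEN = 0.58  # conventional SOC/SOM ratio; the workflow's exact factor may differ

def convert_som_if_needed(df, meta):
    soc_meta = meta.get("properties", {}).get("soc", {})
    if soc_meta.get("som?"):  # truthy (e.g., 1) means the values are reported as SOM
        df["soc"] = df["soc"] * VAN_BEMMELEN
    return df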

In the end, we get a JSON file like this:

Example JSON file “LimeSoDa_NSW.52_meta.json”
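For reference, the contents of this file (matching the metadata loaded later in this notebook) are:

{
  "dataset_id": "LimeSoDa",
  "subset_id": "NSW.52",
  "crs": "EPSG:32755",
  "time": "2018.07,2018.12",
  "depth": "0-10",
  "country": "Australia",
  "properties": {
    "soc": {"unit": "%", "method": "walkley-black"},
    "clay": {"unit": "%", "method": "Sieve-Pipette method, sand by sieving, silt&clay by sedimentation in water (Gee and Bauder 1986)"},
    "ph.h2o": {"unit": null, "method": "a glass electrode with a 5:1 liquid:soil volumetric ratio"}
  }
}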

The standardization ETL

ETL Process

Once the metadata JSON file is prepared, we can start the standardization ETL process using code. In the following, we walk through each step of this process.

To start, we import the required packages and define the relevant file paths. To keep the project portable, all file references use relative paths rather than absolute paths. This ensures the code runs correctly regardless of where the project folder is located.

We organize our project so that code and data live in separate directories:

Directory structure
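A sketch of this layout, reconstructed from the paths used throughout this notebook:

example_data/
├── data_raw/
│   └── LimeSoDa/
│       └── data/
│           ├── NSW.52/
│           │   ├── NSW.52_dataset.csv
│           │   ├── NSW.52_coordinates.csv
│           │   └── LimeSoDa_NSW.52_meta.json
│           └── SC.50/
│               └── ...
├── data_standardized/    # standardized parquet files
├── data_harmonized/      # final harmonized parquet files
└── code/                 # e.g., the service-account key used in the harmonization step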
Code
import os  # for file and path handling
import pandas as pd # for data frame processing
import numpy as np # for numerical calculations
import json # for metadata extraction
from pyproj import Transformer # for coordinate standardization

parent_path = 'example_data' # directory path of your data
raw_folder = f'{parent_path}/data_raw/LimeSoDa/data' # folder with raw data
standard_folder = f'{parent_path}/data_standardized' # folder that stores standardized data

Standardize the LimeSoDa subset NSW.52 so that it follows a consistent format and includes all required metadata columns.

Define version, dataset_id, and subset_id to keep your input and output paths well-organized and structured.

These identifiers allow you to systematically build directory paths and manage data across different stages of processing. Using these paths, the metadata JSON, the main datasets, and the coordinate datasets are loaded.

Code
version = 'v20250723' # version id, can be anything; here we use a date string
dataset_id = "LimeSoDa" # dataset id
subset_id = "NSW.52" # id for the 1st subset in this demonstration

# load in the datasets and coordinates
data = pd.read_csv(f'{raw_folder}/{subset_id}/{subset_id}_dataset.csv')
coordinate = pd.read_csv(f'{raw_folder}/{subset_id}/{subset_id}_coordinates.csv')
df = pd.concat([data, coordinate], axis=1).copy() # join the coordinates and the dataset
ori_len = len(df) # original measurement number
print("the raw data looks like\n", df.head(3))

with open(f'{raw_folder}/{subset_id}/{dataset_id}_{subset_id}_meta.json', 'r') as f:
    meta = json.load(f)    
crs = meta.get("crs")
print("\nthe meta data looks like\n",meta)
the raw data looks like
    SOC_target  pH_target  Clay_target    Altitude     Slope   B02   B8A   B11  \
0        1.00       6.70       55.771  243.794998  0.026625  1472  2262  2857   
1        0.85       8.58       55.538  247.552994  0.019499  1585  2453  3149   
2        0.75       8.75       58.212  241.613998  0.042002  1484  2150  2629   

       x_32755      y_32755  
0  774457.5725  6648441.945  
1  775237.4110  6648541.716  
2  774267.5263  6648114.752  

the meta data looks like
 {'dataset_id': 'LimeSoDa', 'subset_id': 'NSW.52', 'crs': 'EPSG:32755', 'time': '2018.07,2018.12', 'depth': '0-10', 'country': 'Australia', 'properties': {'soc': {'unit': '%', 'method': 'walkley-black'}, 'clay': {'unit': '%', 'method': 'Sieve-Pipette method, sand by sieving, silt&clay by sedimentation in water (Gee and Bauder 1986)'}, 'ph.h2o': {'unit': None, 'method': 'a glass electrode with a 5:1 liquid:soil volumetric ratio'}}}

Observations from the Raw Data

From the initial inspection, the dataset requires several standardization steps:

  • Column names must follow the agreed naming convention (e.g., use soc instead of SOC_target).
  • Coordinate fields are stored in a local projection (e.g., x_32755) instead of geographic coordinates (latitude, longitude).
  • Some essential metadata is missing (e.g., dataset_id, depth, etc.).
  • Some fields are irrelevant and should be removed (e.g., Altitude).

To address these issues, we will standardize each component step by step. Many tasks can be automated using functions from standardize_helper, while others require manual adjustments depending on the dataset.

Before applying helper functions, the following tasks must be handled manually:

  1. Determine pH Measurement Method
    Check the metadata to verify whether pH was measured in water or CaCl₂, as this affects how the values should be interpreted and standardized.

  2. Rename Property Columns
    Update property-related column names to follow the standardized schema.

  3. Identify and Convert Coordinate Columns
    Detect the coordinate columns provided in local projection and convert them into the required geographic coordinate system.

Code
# rename the columns to be standard
df = df.rename(columns={'SOC_target':'soc', 'Clay_target':'clay', 'pH_target':'ph'})

# pH
if "ph" in df.columns:
    if "ph.h2o" in meta.get("properties", {}):
        df.rename(columns={"ph": "ph.h2o"}, inplace=True)
    elif "ph.cacl2" in meta.get("properties", {}):
        df.rename(columns={"ph": "ph.cacl2"}, inplace=True) 

# coordinates conversion: lat and lon
x_cols = [c for c in df.columns if c.startswith("x_")]
y_cols = [c for c in df.columns if c.startswith("y_")]

if x_cols and y_cols:
    x_col, y_col = x_cols[0], y_cols[0]
    if crs and crs.lower() != "epsg:4326":
        transformer = Transformer.from_crs(crs, "EPSG:4326", always_xy=True)
        df["lon"], df["lat"] = transformer.transform(df[x_col].values, df[y_col].values)
    elif crs:
        df.rename(columns={x_col: "lon", y_col: "lat"}, inplace=True)
    else:
        df["lat"] = np.nan
        df["lon"] = np.nan
else:
    df["lat"] = np.nan
    df["lon"] = np.nan    
    
print("now the data contains columns\n", df.columns.values)
now the data contains columns
 ['soc' 'ph.h2o' 'clay' 'Altitude' 'Slope' 'B02' 'B8A' 'B11' 'x_32755'
 'y_32755' 'lon' 'lat']

By applying the attach_metadata(df, meta) function, metadata is extracted from the metadata JSON and appended to the dataframe. Specifically, this function performs the following tasks:

  • Add general metadata such as dataset_id, subset_id, time, country, depth, and lc_survey, as well as property-level metadata (e.g., units and measurement methods) directly to the dataframe, provided these values are available in the metadata file.
    In our case, most of these fields are present in the metadata.

  • Add an additional column soc_som, based on the som? parameter in the metadata, indicating whether SOC values were originally reported as SOM and therefore require conversion using a simple scaling factor.

  • Handle missing or non-uniform metadata.
    If a field is not included in the metadata, it means either (a) it is missing entirely, or (b) it varies within the dataset and must be added manually.
    For this dataset, all required metadata are available except lc_survey, which remains unknown.

After attaching metadata, the standardize_column_types(df) function standardizes column data types and removes irrelevant columns, keeping the dataset clean and consistent.

These steps are commonly needed for most datasets, which is why they have been implemented as helper functions—to reduce repetitive manual work.
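The helper implementations are not shown in this notebook. As an illustration only, here is a simplified, hypothetical sketch of the kind of logic attach_metadata applies (the real helper also derives hzn_dep, hzn_top, and hzn_btm from the depth field, as the output below shows):

Code
import pandas as pd

def attach_metadata_sketch(df: pd.DataFrame, meta: dict) -> pd.DataFrame:
    """Simplified sketch of attach_metadata: broadcast metadata into constant columns."""
    # general metadata becomes constant columns
    for key in ["dataset_id", "subset_id", "time", "country"]:
        if key in meta:
            df[key] = meta[key]
    # property-level metadata becomes e.g. soc_unit / soc_method columns
    for prop, info in meta.get("properties", {}).items():
        for field in ("unit", "method"):
            if info.get(field) is not None:
                df[f"{prop}_{field}"] = info[field]
    return df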

Code
from standardize_helper import attach_metadata, standardize_column_types # tailored helper functions

# attach meta data from meta.json to the df
# could also pass the path to meta.json here
df = attach_metadata(df, meta) 
print("now the data contains columns\n", df.columns.values)
print()
df, mismatch = standardize_column_types(df)
print("now the data contains columns\n", df.columns.values)
now the data contains columns
 ['soc' 'ph.h2o' 'clay' 'Altitude' 'Slope' 'B02' 'B8A' 'B11' 'x_32755'
 'y_32755' 'lon' 'lat' 'dataset_id' 'subset_id' 'time' 'country' 'depth'
 'hzn_dep' 'hzn_top' 'hzn_btm' 'soc_unit' 'soc_method' 'clay_unit'
 'clay_method' 'ph.h2o_method']

Missing meta column: site_id, double check if you could add it...
Missing meta column: lc_survey, double check if you could add it...
Dropping unexpected columns: ['Altitude', 'B02', 'B11', 'B8A', 'Slope', 'depth', 'x_32755', 'y_32755']
now the data contains columns
 ['soc' 'ph.h2o' 'clay' 'lon' 'lat' 'dataset_id' 'subset_id' 'time'
 'country' 'hzn_dep' 'hzn_top' 'hzn_btm' 'soc_unit' 'soc_method'
 'clay_unit' 'clay_method' 'ph.h2o_method']

Once the data is standardized, we save the processed dataset.

Up to this point, no measurements (rows) have been removed, regardless of their quality.
All filtering or quality-control steps will be handled later in the workflow.

Code
now_len = len(df)
print(f"{ori_len-now_len} measurements missing during the standardization process.")

df.to_parquet(standard_folder+f"/{dataset_id}_{subset_id}_standardized_{version}.pq")
0 measurements missing during the standardization process.

Now we repeat the process for the second subset: SC.50.

Code
subset_id = "SC.50" # id for the 2nd subset in this demonstration

# load in the datasets and coordinates
data = pd.read_csv(f'{raw_folder}/{subset_id}/{subset_id}_dataset.csv')
coordinate = pd.read_csv(f'{raw_folder}/{subset_id}/{subset_id}_coordinates.csv')
df = pd.concat([data, coordinate], axis=1).copy() # join the coordinates and the dataset
ori_len = len(df) # original measurement number
print("the raw data looks like\n", df.head(3))

with open(f'{raw_folder}/{subset_id}/{dataset_id}_{subset_id}_meta.json', 'r') as f:
    meta = json.load(f)    
crs = meta.get("crs")
print("\nthe meta data looks like\n",meta)
print()

# rename the columns to be standard
df = df.rename(columns={'SOC_target':'soc', 'Clay_target':'clay', 'pH_target':'ph'})

# pH
if "ph" in df.columns:
    if "ph.h2o" in meta.get("properties", {}):
        df.rename(columns={"ph": "ph.h2o"}, inplace=True)
    elif "ph.cacl2" in meta.get("properties", {}):
        df.rename(columns={"ph": "ph.cacl2"}, inplace=True) 
        
# coordinates conversion: lat and lon
x_cols = [c for c in df.columns if c.startswith("x_")]
y_cols = [c for c in df.columns if c.startswith("y_")]

if x_cols and y_cols:
    x_col, y_col = x_cols[0], y_cols[0]
    if crs and crs.lower() != "epsg:4326":
        transformer = Transformer.from_crs(crs, "EPSG:4326", always_xy=True)
        df["lon"], df["lat"] = transformer.transform(df[x_col].values, df[y_col].values)
    elif crs:
        df.rename(columns={x_col: "lon", y_col: "lat"}, inplace=True)
    else:
        df["lat"] = np.nan
        df["lon"] = np.nan
else:
    df["lat"] = np.nan
    df["lon"] = np.nan    
    
print("now the data contains columns\n", df.columns.values)
print()

# attach meta data from meta.json to the df
# could also pass the path to meta.json here
df = attach_metadata(df, meta) 
print("now the data contains columns\n", df.columns.values)
print()
df, mismatch = standardize_column_types(df)
print("now the data contains columns\n", df.columns.values)

df.to_parquet(standard_folder+f"/{dataset_id}_{subset_id}_standardized_{version}.pq")
the raw data looks like
    SOC_target  pH_target  Clay_target     Altitude     Slope        ERa  \
0       2.209        5.8           68  1024.848022  2.736778  37.537538   
1       2.267        5.9           68  1025.331909  2.266586  35.001750   
2       2.093        5.6           66  1027.451660  1.142883  66.006601   

      x_32722      y_32722  
0  542488.019  6975548.675  
1  542487.752  6975499.938  
2  542451.559  6975474.032  

the meta data looks like
 {'dataset_id': 'LimeSoDa', 'subset_id': 'SC.50', 'time': '2013.11', 'country': 'Brazil', 'crs': 'EPSG:32722', 'depth': '0-20', 'properties': {'soc': {'unit': '%', 'method': 'walkley-black', 'som?': 0}, 'clay': {'unit': '%', 'method': 'Sieve-Pipette method, sand by sieving, silt&clay by sedimentation in water, German adaptation (DIN ISO 11277)'}, 'ph.h2o': {'unit': None, 'method': 'a glass electrode with a 5:1 liquid:soil volumetric ratio'}}}

now the data contains columns
 ['soc' 'ph.h2o' 'clay' 'Altitude' 'Slope' 'ERa' 'x_32722' 'y_32722' 'lon'
 'lat']

now the data contains columns
 ['soc' 'ph.h2o' 'clay' 'Altitude' 'Slope' 'ERa' 'x_32722' 'y_32722' 'lon'
 'lat' 'dataset_id' 'subset_id' 'time' 'country' 'depth' 'hzn_dep'
 'hzn_top' 'hzn_btm' 'soc_unit' 'soc_method' 'clay_unit' 'clay_method'
 'ph.h2o_method']

Missing meta column: site_id, double check if you could add it...
Missing meta column: lc_survey, double check if you could add it...
Dropping unexpected columns: ['Altitude', 'ERa', 'Slope', 'depth', 'x_32722', 'y_32722']
now the data contains columns
 ['soc' 'ph.h2o' 'clay' 'lon' 'lat' 'dataset_id' 'subset_id' 'time'
 'country' 'hzn_dep' 'hzn_top' 'hzn_btm' 'soc_unit' 'soc_method'
 'clay_unit' 'clay_method' 'ph.h2o_method']

Combine standardized datasets

Now we have all the example datasets standardized and stored in the directory data_standardized, as shown below.

Standardized datasets

Next, we combine them and clean the data based on the metadata.

Code
data = []
for filename in os.listdir(standard_folder):
    if filename.endswith(f"_standardized_{version}.pq"):
        filepath = os.path.join(standard_folder, filename)
        print(f"Adding: {filepath}")
        dt = pd.read_parquet(filepath)
        data.append(dt)
        print(dt.shape)
        print('------------')

df_standardized = pd.concat(data)
print('standardized data shape', df_standardized.shape)
print("missing data check\n%, records, meta")
for ii in ["dataset_id", "subset_id", "time", "site_id", "country", "lat", "lon", "lc_survey","hzn_top","hzn_btm"]:
    if ii in df_standardized.columns:
        print(round(df_standardized[ii].isna().sum()*100/len(df_standardized),1), df_standardized[ii].isna().sum(), ii)
Adding: example_data/data_standardized/LimeSoDa_NSW.52_standardized_v20250723.pq
(52, 17)
------------
Adding: example_data/data_standardized/all_standardized_v20250723.pq
(50, 59)
------------
Adding: example_data/data_standardized/LimeSoDa_SC.50_standardized_v20250723.pq
(50, 17)
------------
Adding: example_data/data_standardized/sharable_standardized_v20250723.pq
(1219298, 59)
------------
standardized data shape (1219450, 62)
missing data check
%, records, meta
0.0 0 dataset_id
0.0 0 subset_id
100.0 1219348 time
0.0 152 site_id
0.0 0 country
0.0 0 lat
0.0 0 lon
0.0 152 lc_survey
100.0 1219348 hzn_top
100.0 1219348 hzn_btm

We now clean the data based on metadata: records lacking valid depth, year, or coordinate information are removed. Then we save the data.

We also standardize:

  • Depth, into a single column, hzn_dep. If hzn_dep is missing, it is computed as the mean of hzn_top and hzn_btm.
  • Year, extracted and standardized from the original time column (since the current covariate layers support year-level resolution).

Code
# calculate hzn_dep
mask = df_standardized['hzn_dep'].isna()
df_standardized.loc[mask, 'hzn_dep'] = df_standardized.loc[mask, ['hzn_top', 'hzn_btm']].mean(axis=1)

# check depth information validity
na = df_standardized.loc[df_standardized['hzn_dep'].isna()]
df_standardized = df_standardized.loc[df_standardized['hzn_dep'].notna()].reset_index(drop=True)
print(f'{len(na)} data with no depth info, from ', na['dataset_id'].unique())

df_standardized = df_standardized.drop(columns=['hzn_top','hzn_btm'])

# time
na = df_standardized.loc[df_standardized['time'].isna()]
df_standardized = df_standardized.loc[df_standardized['time'].notna()].reset_index(drop=True)
print(f'{len(na)} data with no time info, from ', na['dataset_id'].unique())

# extract year
from standardize_helper import extract_year
df_standardized["year"] = df_standardized["time"].map(extract_year)

na = df_standardized.loc[df_standardized['year'].isna()]
df_standardized = df_standardized.loc[df_standardized['year'].notna()].reset_index(drop=True)
print(f'{len(na)} data with no valid year info, from ', na['dataset_id'].unique())
df_standardized = df_standardized.drop(columns=['time'])

# coordinates
na = df_standardized.loc[df_standardized['lat'].isna() | df_standardized['lon'].isna()]
df_standardized = df_standardized.loc[df_standardized['lat'].notna() & df_standardized['lon'].notna()].reset_index(drop=True)
print(f'{len(na)} data with nan coordinate info, from ', na['dataset_id'].unique())

mask_inf = np.isinf(df_standardized["lat"]) | np.isinf(df_standardized["lon"])
na = df_standardized.loc[mask_inf]
df_standardized = df_standardized.loc[~mask_inf].reset_index(drop=True)
print(f'{len(na)} data with inf coordinate info, from ', na['dataset_id'].unique())

print(df_standardized.shape)
df_standardized.to_parquet(f'{standard_folder}/all_standardized_{version}.pq')
0 data with no depth info, from  []
1219348 data with no time info, from  ['LimeSoDa' 'BHR-P' 'BIS' 'GLanCE' 'Geocradle' 'INFOSOLO' 'LUCAS' 'MarSOC'
 'SoDaH']
52 data with no valid year info, from  ['LimeSoDa']
0 data with nan coordinate info, from  []
0 data with inf coordinate info, from  []
(50, 59)
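The extract_year helper is likewise not shown here. Judging from the output above, where the 52 NSW.52 rows (whose time is the range '2018.07,2018.12') are dropped for lacking a single valid year, a minimal sketch consistent with that behavior could be:

Code
import numpy as np

def extract_year_sketch(time_str):
    """Sketch: parse a 'YYYY.MM' string into a year; ranges yield no single year."""
    s = str(time_str)
    if ',' in s:  # a sampling range is ambiguous -> no single valid year
        return np.nan
    try:
        year = int(s.split('.')[0])
        return year if 1900 <= year <= 2100 else np.nan
    except ValueError:
        return np.nan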

Harmonization ETL

This step aligns measurement values across datasets so they are comparable with the LUCAS reference methods.

It uses the standardized dataset all_standardized_{version_code}.pq as input and harmonizes property values by accounting for both method and unit differences.

ETL process for harmonization

Method harmonization ensures that property values measured using different protocols become comparable. The process relies on a shared harmonization sheet and goes through the following steps:

  • For each combination of dataset_id and {prop}_method, register any unknown methods in the harmonization sheet using the register_methods function from harmonization_helper.py.
Code
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import warnings
warnings.filterwarnings("ignore")

df_standardized = pd.read_parquet(f"{standard_folder}/all_standardized_v20250723.pq")

scope = ["https://spreadsheets.google.com/feeds", "https://www.googleapis.com/auth/drive"]
key_json = f"{parent_path}/code/soilharmonization-60c3742e60d5.json"
creds = ServiceAccountCredentials.from_json_keyfile_name(key_json, scope)
client = gspread.authorize(creds)
sheet = client.open("AI4SoilHealth soil laboratory data harmonization info")

from harmonization_helper import register_methods
for prop in ['soc','ph.h2o','clay']:
    print(f'---------------------{prop}-----------------------')
    tab = sheet.worksheet(prop.replace('.','_'))
    tab_values = tab.get_all_values()
    tab_df = pd.DataFrame(tab_values[1:], columns=tab_values[0])

    register_methods(tab, df_standardized, tab_df, prop, version)
---------------------soc-----------------------
No new methods to register.
---------------------ph.h2o-----------------------
No new methods to register.
---------------------clay-----------------------
No new methods to register.

Once a method is registered in the harmonization sheet, it appears in the corresponding tab with the columns
src, method, version, and count automatically populated.
The remaining columns are then completed manually based on available documentation, completeness, and domain knowledge.
These entries—especially the formula column—define how the conversion(harmonized_df, tab_df, prop=prop) function transforms the property values.

The conversion(...) function also removes low-quality entries (e.g., records lacking any documentation of the measurement method).
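The formula entries are dataset-specific and maintained in the sheet itself. As a hypothetical illustration (not the actual helper), a formula written as an expression in terms of x could be applied with pandas.eval; the 1.3 factor below matches the scaling visible in the harmonized SC.50 Walkley-Black SOC values at the end of this section (e.g., 2.209 → 2.8717).

Code
import pandas as pd

def apply_formula(values: pd.Series, formula: str) -> pd.Series:
    """Hypothetical sketch: evaluate a harmonization formula on a property column."""
    return pd.eval(formula, local_dict={"x": values})

# e.g., correcting Walkley-Black SOC for incomplete oxidation:
# harmonized_df["soc"] = apply_formula(harmonized_df["soc"], "x * 1.3")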

Example of harmonization sheet
Code
from harmonization_helper import conversion
harmonized_df = df_standardized.copy()

for prop in ['soc','ph.h2o','clay']:
    print(f'---------------------{prop}-----------------------')
    
    tab = sheet.worksheet(prop.replace('.','_'))
    tab_values = tab.get_all_values()
    tab_df = pd.DataFrame(tab_values[1:], columns=tab_values[0])
    
    harmonized_df = conversion(harmonized_df, tab_df, prop=prop)
    print(harmonized_df.shape)
---------------------soc-----------------------
All methods for soc are in registry.
(50, 58)
---------------------ph.h2o-----------------------
All methods for ph.h2o are in registry.
(50, 57)
---------------------clay-----------------------
All methods for clay are in registry.
(50, 56)

After this, each property should be examined one by one to normalize units across datasets. Specifically, we:

  • Check the unit metadata
  • Inspect value distributions when needed
  • Remove unrealistic values (extreme outliers, negative values where invalid, etc.)

Code
import matplotlib.pyplot as plt

for prop in ['soc', 'ph.h2o', 'clay']:
    unit  = f"{prop}_unit"
    if unit in harmonized_df.columns:
        print(unit, harmonized_df[unit].unique())
    plt.figure()  # create a new figure for each property
    harmonized_df[prop].hist()
    plt.title(f'Distribution of {prop}')
    plt.xlabel(prop)
    plt.ylabel('Frequency')
    plt.show()
soc_unit ['%']
[Histogram: Distribution of soc]

ph.h2o_unit [None]
[Histogram: Distribution of ph.h2o]

clay_unit ['%']
[Histogram: Distribution of clay]

In our case, each property is recorded in a single unit, and the distributions show no unrealistic values, so no additional processing is required. We remove all unit and method columns and save the final harmonized dataset as all_harmonized_{version_code}.pq, which includes:

  • Metadata
  • Property information
  • Harmonized values

This dataset is ready for spatial overlay and modeling.

Code
harmonize_folder = f'{parent_path}/data_harmonized'
harmonized_df.to_parquet(f'{harmonize_folder}/all_harmonized_{version}.pq')
harmonized_df
lon lat dataset_id subset_id country hzn_dep soc_unit clay_unit year soc_som ... extractable.k_method carbonates_unit carbonates_method cf.vol cf.vol_unit cf.vol_method anonymized soc ph.h2o clay
0 -50.570452 -27.342563 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.8717 5.8 68.0
1 -50.570453 -27.343003 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.9471 5.9 68.0
2 -50.570818 -27.343238 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 5.6 66.0
3 -50.570816 -27.342786 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7963 5.4 76.0
4 -50.570804 -27.342325 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.5701 5.9 68.0
5 -50.571298 -27.342560 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.5701 5.8 71.0
6 -50.571314 -27.342993 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7963 5.5 76.0
7 -50.571310 -27.343448 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 3.2500 5.6 71.0
8 -50.571783 -27.343668 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.8717 5.7 66.0
9 -50.571795 -27.343202 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 3.1746 5.8 68.0
10 -50.571797 -27.342762 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.9471 5.8 73.0
11 -50.571759 -27.342352 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 3.0238 6.1 71.0
12 -50.572273 -27.342384 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.9471 5.4 73.0
13 -50.572326 -27.342804 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 5.6 76.0
14 -50.572354 -27.343230 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 3.0992 5.4 71.0
15 -50.572352 -27.343617 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 6.4 63.0
16 -50.572340 -27.344058 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 6.5 66.0
17 -50.572822 -27.343929 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4180 6.0 71.0
18 -50.572826 -27.343460 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7963 6.3 71.0
19 -50.572833 -27.343003 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 5.8 71.0
20 -50.572821 -27.342555 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.8717 6.0 68.0
21 -50.572814 -27.342086 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.3426 6.3 66.0
22 -50.572807 -27.341693 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.1164 6.5 73.0
23 -50.573328 -27.341875 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.1918 6.9 68.0
24 -50.573352 -27.342310 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.5701 6.3 73.0
25 -50.573325 -27.342755 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.3426 5.9 71.0
26 -50.573338 -27.343201 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.5701 5.7 73.0
27 -50.573324 -27.343694 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.6455 5.8 86.0
28 -50.573345 -27.344150 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.5701 5.6 73.0
29 -50.573823 -27.343942 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4947 5.4 76.0
30 -50.573821 -27.343467 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.8717 5.2 78.0
31 -50.573834 -27.343019 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4180 6.0 73.0
32 -50.573864 -27.342593 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 6.0 71.0
33 -50.573853 -27.342120 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 6.5 73.0
34 -50.573858 -27.341655 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4180 6.4 71.0
35 -50.574332 -27.341431 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 3.2500 6.3 66.0
36 -50.574363 -27.341916 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 5.4 66.0
37 -50.574369 -27.342318 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.5701 5.6 78.0
38 -50.574377 -27.342767 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4180 5.5 76.0
39 -50.574357 -27.343250 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 3.0238 6.0 73.0
40 -50.574358 -27.343704 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 3.0992 5.9 66.0
41 -50.574355 -27.344146 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.8717 6.4 76.0
42 -50.574895 -27.344191 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.5701 5.8 71.0
43 -50.574851 -27.343670 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4180 5.6 76.0
44 -50.574853 -27.343268 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4180 5.6 76.0
45 -50.575365 -27.343456 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.7209 4.9 81.0
46 -50.575372 -27.343937 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4180 5.4 78.0
47 -50.575377 -27.344352 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4947 5.4 76.0
48 -50.575820 -27.344124 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.4180 5.1 78.0
49 -50.575857 -27.343685 LimeSoDa SC.50 Brazil 10.0 % % 2013.0 NaN ... None None None NaN None None NaN 2.5701 5.4 78.0

50 rows × 56 columns