Commit f052d40

Update README.md
1 parent 40eb418 commit f052d40

1 file changed

Lines changed: 82 additions & 10 deletions

README.md

@@ -38,6 +38,7 @@
3838
* [NAPSIntegratedDataExporter](#napsintegrateddataexporter)
3939
- [How To Run Individual Tools](#how-to-run-individual-tools)
4040
- [Pollutants](#pollutants)
41+
- [Methods and Report Types](#methods-and-report-types)
4142
- [Database Design](#database-design)
4243
- [Known Issues](#known-issues)
4344
- [Developer Notes](#developer-notes)
@@ -438,7 +439,7 @@ You can invoke this tool by running the class `com.dbf.naps.data.loader.continuo
438439

439440
## NAPSContinuousDataQuery
440441

441-
This powerful Java tool allows you to dynamically query the NAPS continuous data that was loaded into a PostgreSQL database using the [NAPSContinuousDataLoader](#napscontinuousdataloader). It will output a CSV file containing a table of data based on the query rules that you provide. This tool is intended to be used for aggregating data (i.e. average, sum, minimum, maximum, etc.) that is grouped by one or more fields (e.g. pollutant, site, year, month, day, etc.). If you need to generate large tables of data that do not involve grouping functions, have a look at the [NAPSContinuousDataExporter](#napscontinuousdataexporter).
442+
This powerful Java tool allows you to dynamically query the NAPS continuous data that was loaded into a PostgreSQL database using the [NAPSContinuousDataLoader](#napscontinuousdataloader). It will output a CSV and/or JSON file containing a table or map of data based on the query rules that you provide. This tool is intended to be used for aggregating data (e.g. average, sum, minimum, maximum, etc.) that is grouped by one or more fields (e.g. pollutant, site, year, month, day, etc.). If you need to generate large tables of data that do not involve grouping functions, have a look at the [NAPSContinuousDataExporter](#napscontinuousdataexporter).
442443

443444
You can invoke this tool by running the class `com.dbf.naps.data.analysis.query.continuous.NAPSContinuousDataQuery`.
444445

@@ -528,8 +529,8 @@ You can invoke this tool by running the class `com.dbf.naps.data.analysis.query.
528529
- The possible values for `urbanization` are `LU, MU, SU, NU`, representing `Large Urban, Medium Urban, Small Urban, Rural (Non Urban)` (respectively).
529530
- Both site (station) names and city names are treated as case-insensitive partial matches. This means a value of `labrador` will match the city name of `LABRADOR CITY`.
530531
- See the [section below](#pollutants) for a list of all supported pollutants.
531-
- The possible values for `methods` are `170, 181, 184, 195, 236, 636, 703, 706, 731, 760`. These represent the main analytical methods used for analysis and only apply to the PM2.5 pollutant. All other pollutants are have a value of `N/A`.
532-
- The possible values for `reportTypes` are `CO, NO, NO2, NOX, O3, PM10, PM2.5, SO2`, corresponding directly to the pollutant names. These represent the type of report from which the data was originally sourced.
532+
- The possible values for `methods` are `170, 181, 184, 195, 236, 636, 703, 706, 731, 760`. These represent the main analytical methods used for analysis and only apply to the PM2.5 pollutant. All other pollutants have a value of `N/A`. See the [section below](#methods-and-report-types) for more information.
533+
- The possible values for `reportTypes` are `CO, NO, NO2, NOX, O3, PM10, PM2.5, SO2`, corresponding directly to the pollutant names. These represent the type of report from which the data was originally sourced. See the [section below](#methods-and-report-types) for more information.
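
The case-insensitive partial matching described in the notes above can be pictured with a small standalone sketch. This is only an illustration of the matching rule, not the tool's actual implementation (which may well perform the match in SQL). The class name `PartialNameMatch`, the method `matches`, and the sample city names other than `LABRADOR CITY` are assumptions made for the example.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper illustrating a case-insensitive partial (substring) match.
final class PartialNameMatch {

    // Returns true when the filter text appears anywhere in the stored name, ignoring case.
    static boolean matches(String storedName, String filter) {
        return storedName.toLowerCase(Locale.ROOT)
                .contains(filter.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Sample values for illustration only; not read from the NAPS database.
        List<String> cities = List.of("LABRADOR CITY", "TORONTO", "OTTAWA");
        for (String city : cities) {
            System.out.println(city + " matches 'labrador': " + matches(city, "labrador"));
        }
    }
}
```

Under this rule, a short filter such as `labrador` selects every site or city whose name contains that text, so a more specific value is needed to narrow the result to a single city.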
533534

534535
**Other Notes:**
535536
- A title will be automatically generated for the report based on the aggregation and filtering rules that you provide. You can override this title by using the `--title` option. Setting it to empty `""` will omit it entirely.
@@ -731,8 +732,8 @@ The default colour palette, if not specified, is number 1. Here are examples of
731732
- The possible values for `urbanization` are `LU, MU, SU, NU`, representing `Large Urban, Medium Urban, Small Urban, Rural (Non Urban)` (respectively).
732733
- Both site (station) names and city names are treated as case-insensitive partial matches. This means a value of `labrador` will match the city name of `LABRADOR CITY`.
733734
- See the [section below](#pollutants) for a list of all supported pollutants.
734-
- The possible values for `methods` are `170, 181, 184, 195, 236, 636, 703, 706, 731, 760`. These represent the main analytical methods used for analysis and only apply to the PM2.5 pollutant. All other pollutants are have a value of `N/A`.
735-
- The possible values for `reportTypes` are `CO, NO, NO2, NOX, O3, PM10, PM2.5, SO2`, corresponding directly to the pollutant names. These represent the type of report from which the data was originally sourced.
735+
- The possible values for `methods` are `170, 181, 184, 195, 236, 636, 703, 706, 731, 760`. These represent the main analytical methods used for analysis and only apply to the PM2.5 pollutant. All other pollutants have a value of `N/A`. See the [section below](#methods-and-report-types) for more information.
736+
- The possible values for `reportTypes` are `CO, NO, NO2, NOX, O3, PM10, PM2.5, SO2`, corresponding directly to the pollutant names. These represent the type of report from which the data was originally sourced. See the [section below](#methods-and-report-types) for more information.
736737

737738
**Rendering Options:**
738739
- The `colourLowerBound` and `colourUpperBound` options can be used to limit the scale that is mapped to the colour gradient. This is useful for emphasizing differences that appear in the centre of the overall range of values, or for preventing outliers from shifting the entire scale. When either bound is specified, the legend will indicate it by adding `>=` and `<=` to the bottom and top of the scale, respectively. If not specified, the minimum and maximum values of the colour gradient scale will be calculated automatically.
@@ -815,7 +816,7 @@ You can invoke this tool by running the class `com.dbf.naps.data.loader.integrat
815816

816817
## NAPSIntegratedDataQuery
817818

818-
This powerful Java tool allows you to dynamically query the NAPS integrated data that was loaded into a PostgreSQL database using the [NAPSIntegratedDataLoader](#napsintegrateddataloader). It will output a CSV file containing a table of data based on the query rules that you provide. It functions the same as the [NAPSContinuousDataQuery](#napscontinuousdataloader) and accepts all of the same command line arguments, with the exception that the data fields used for grouping cannot include `HOUR`, since hour attribute only applies to continuous data, not integrated data.
819+
This powerful Java tool allows you to dynamically query the NAPS integrated data that was loaded into a PostgreSQL database using the [NAPSIntegratedDataLoader](#napsintegrateddataloader). It will output a CSV and/or JSON file containing a table or map of data based on the query rules that you provide. It functions the same as the [NAPSContinuousDataQuery](#napscontinuousdataquery) and accepts all of the same command line arguments, with the exception that the data fields used for grouping cannot include `HOUR`, since the hour attribute only applies to continuous data, not integrated data.
819820

820821
You can invoke this tool by running the class `com.dbf.naps.data.analysis.query.integrated.NAPSIntegratedDataQuery`.
821822

@@ -875,8 +876,8 @@ You can invoke this tool by running the class `com.dbf.naps.data.analysis.query.
875876
**Notes:**
876877
- Possible values for `group1` through `group5` are `YEAR, MONTH, DAY, DAY_OF_WEEK, DAY_OF_YEAR, WEEK_OF_YEAR, NAPS_ID, POLLUTANT, PROVINCE_TERRITORY, SITE_TYPE, URBANIZATION`.
877878
- AQHI values are not supported for the integrated data set.
878-
- The possible values for `methods` are `ED-XRF, GC-FID, GC-MS, GC-MS TP+G, HPLC, IC, IC-PAD, ICPMS, Microbalance, TOR, WICPMS`. These represent the main analytical methods used for analysis.
879-
- The possible values for `reportTypes` are `CARB, CARBONYLS, DICHOT, HCB, IC, ICPMS, LEV, NA, NH4, PAH, PCB, PCDD, PM10, PM2.5, PM2.5-10, SPEC, VOC, VOC_4HR, WICPMS`. These represent the type of report from which the data was originally sourced.
879+
- The possible values for `methods` are `ED-XRF, GC-FID, GC-MS, GC-MS TP+G, HPLC, IC, IC-PAD, ICPMS, Microbalance, TOR, WICPMS`. These represent the main analytical methods used for analysis. See the [section below](#methods-and-report-types) for more information.
880+
- The possible values for `reportTypes` are `CARB, CARBONYLS, DICHOT, HCB, IC, ICPMS, LEV, NA, NH4, PAH, PCB, PCDD, PM10, PM2.5, PM2.5-10, SPEC, VOC, VOC_4HR, WICPMS`. These represent the type of report from which the data was originally sourced. See the [section below](#methods-and-report-types) for more information.
880881
- With the exception of the above, all of the other rules and restrictions of the [NAPSContinuousDataQuery](#napscontinuousdataquery) apply.
881882

882883
## NAPSIntegratedHeatMap
@@ -943,8 +944,8 @@ You can invoke this tool by running the class `com.dbf.naps.data.analysis.heatma
943944
**Notes:**
944945
- Possible values for `group1` and `group2` are `YEAR, MONTH, DAY, DAY_OF_WEEK, DAY_OF_YEAR, WEEK_OF_YEAR, NAPS_ID, POLLUTANT, PROVINCE_TERRITORY, SITE_TYPE, URBANIZATION`.
945946
- AQHI values are not supported for the integrated data set.
946-
- The possible values for `methods` are `ED-XRF, GC-FID, GC-MS, GC-MS TP+G, HPLC, IC, IC-PAD, ICPMS, Microbalance, TOR, WICPMS`. These represent the main analytical methods used for analysis.
947-
- The possible values for `reportTypes` are `CARB, CARBONYLS, DICHOT, HCB, IC, ICPMS, LEV, NA, NH4, PAH, PCB, PCDD, PM10, PM2.5, PM2.5-10, SPEC, VOC, VOC_4HR, WICPMS`. These represent the type of report from which the data was originally sourced.
947+
- The possible values for `methods` are `ED-XRF, GC-FID, GC-MS, GC-MS TP+G, HPLC, IC, IC-PAD, ICPMS, Microbalance, TOR, WICPMS`. These represent the main analytical methods used for analysis. See the [section below](#methods-and-report-types) for more information.
948+
- The possible values for `reportTypes` are `CARB, CARBONYLS, DICHOT, HCB, IC, ICPMS, LEV, NA, NH4, PAH, PCB, PCDD, PM10, PM2.5, PM2.5-10, SPEC, VOC, VOC_4HR, WICPMS`. These represent the type of report from which the data was originally sourced. See the [section below](#methods-and-report-types) for more information.
948949
- With the exception of the above, all of the other rules and restrictions of the [NAPSContinuousHeatMap](#napscontinuousheatmap) apply.
949950

950951
## NAPSIntegratedDataExporter
@@ -1386,6 +1387,77 @@ The following table lists all of the compounds (pollutants) and how many data po
13861387
|Zinc|128947|
13871388
</details>
13881389

1390+
# Methods and Report Types
1391+
1392+
I have tried my best to categorize all of the NAPS data by both the broad analytical method used for analysis (e.g. GC-MS, IC, ED-XRF, etc.) and the report type from which the data originated. The report type roughly corresponds to the naming scheme of the raw data files posted on the [NAPS data portal](https://data-donnees.az.ec.gc.ca/data/air/monitor/national-air-pollution-surveillance-naps-program/), which is not always consistent from year to year.
1393+
1394+
<details>
1395+
<summary>Table of Methods and Report Types</summary>
1396+
1397+
### As of March 2025
1398+
1399+
**Continuous**
1400+
|Report Type|Method|Units|
1401+
|:--- | :---| :---|
1402+
|CO|N/A|ppm|
1403+
|O3|N/A|ppb|
1404+
|PM10|N/A|µg/m³|
1405+
|SO2|N/A|ppb|
1406+
|NO2|N/A|ppb|
1407+
|NOX|N/A|ppb|
1408+
|NO|N/A|ppb|
1409+
|PM2.5|706|µg/m³|
1410+
|PM2.5|731|µg/m³|
1411+
|PM2.5|170|µg/m³|
1412+
|PM2.5|184|µg/m³|
1413+
|PM2.5|703|µg/m³|
1414+
|PM2.5|760|µg/m³|
1415+
|PM2.5|181|µg/m³|
1416+
|PM2.5|195|µg/m³|
1417+
|PM2.5|236|µg/m³|
1418+
|PM2.5|636|µg/m³|
1419+
1420+
**Integrated**
1421+
|Report Type|Method|Units|
1422+
|:--- | :---| :---|
1423+
|PCB|GC-MS|pg/m³|
1424+
|LEV|IC|µg/m³|
1425+
|PAH|GC-MS TP+G|µg/m³|
1426+
|PM2.5|ICPMS|µg/m³|
1427+
|PM2.5|TOR|µg/m³|
1428+
|PM2.5|IC-PAD|µg/m³|
1429+
|PM2.5|IC|ppbv|
1430+
|PM2.5-10|Microbalance|µg/m³|
1431+
|PM2.5-10|ED-XRF|µg/m³|
1432+
|PM2.5-10|IC|µg/m³|
1433+
|PM2.5-10|ICPMS|µg/m³|
1434+
|PAH|Microbalance|µg/m³|
1435+
|CARBONYLS|HPLC|µg/m³|
1436+
|VOC_4HR|GC-FID|µg/m³|
1437+
|VOC_4HR|GC-MS|µg/m³|
1438+
|PM10|Microbalance|µg/m³|
1439+
|PM10|ED-XRF|µg/m³|
1440+
|DICHOT|Microbalance|µg/m³|
1441+
|DICHOT|IC|µg/m³|
1442+
|DICHOT|ED-XRF|µg/m³|
1443+
|PAH|GC-MS|µg/m³|
1444+
|PCDD|GC-MS|pg/m³|
1445+
|VOC|GC-FID|µg/m³|
1446+
|VOC|GC-MS|µg/m³|
1447+
|HCB|GC-MS|µg/m³|
1448+
|PM2.5|Microbalance|µg/m³|
1449+
|PM2.5|IC|µg/m³|
1450+
|PM2.5|ED-XRF|µg/m³|
1451+
|CARB|TOR|µg/m³|
1452+
|WICPMS|WICPMS|µg/m³|
1453+
|SPEC|IC|µg/m³|
1454+
|SPEC|TOR|µg/m³|
1455+
|NH4|IC|µg/m³|
1456+
|IC|IC|µg/m³|
1457+
|ICPMS|ICPMS|µg/m³|
1458+
|NA|IC|µg/m³|
1459+
</details>
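
As a rough illustration of how the continuous portion of the table above might be used, the following sketch models it as a lookup from report type to the methods listed for it, and checks whether a given combination appears in the table. This is not code from the project; the class name `ContinuousMethods` and the method `isValid` are hypothetical, and only the report type and method values are taken from the table.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical lookup built from the "Continuous" rows of the table (as of March 2025).
final class ContinuousMethods {

    // Report types whose method column is listed as N/A.
    private static final Set<String> NO_METHOD = Set.of("N/A");

    static final Map<String, Set<String>> METHODS_BY_REPORT_TYPE = Map.of(
            "CO", NO_METHOD,
            "NO", NO_METHOD,
            "NO2", NO_METHOD,
            "NOX", NO_METHOD,
            "O3", NO_METHOD,
            "PM10", NO_METHOD,
            "SO2", NO_METHOD,
            "PM2.5", Set.of("170", "181", "184", "195", "236",
                            "636", "703", "706", "731", "760"));

    // Returns true when the (report type, method) pair appears in the continuous table.
    static boolean isValid(String reportType, String method) {
        return METHODS_BY_REPORT_TYPE.getOrDefault(reportType, Set.of()).contains(method);
    }

    public static void main(String[] args) {
        System.out.println("PM2.5 / 706: " + isValid("PM2.5", "706"));
        System.out.println("CO / 706: " + isValid("CO", "706"));
    }
}
```

A check like this could help catch a filter combination, such as a numeric method paired with a non-PM2.5 report type, that the table suggests would return no rows.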
1460+
13891461
# Database Design
13901462

13911463
I am using a normalized relational PostgreSQL database to store the data. I have chosen to hold the continuous data and the integrated data in separate tables to improve performance. I don't think there is a frequent need to query the data in both tables at the same time. The following diagram illustrates the schema design.
