|
38 | 38 | * [NAPSIntegratedDataExporter](#napsintegrateddataexporter) |
39 | 39 | - [How To Run Individual Tools](#how-to-run-individual-tools) |
40 | 40 | - [Pollutants](#pollutants) |
| 41 | +- [Methods and Report Types](#methods-and-report-types) |
41 | 42 | - [Database Design](#database-design) |
42 | 43 | - [Known Issues](#known-issues) |
43 | 44 | - [Developer Notes](#developer-notes) |
@@ -438,7 +439,7 @@ You can invoke this tool by running the class `com.dbf.naps.data.loader.continuo |
438 | 439 |
|
439 | 440 | ## NAPSContinuousDataQuery |
440 | 441 |
|
441 | | -This powerful Java tool allows you to dynamically query the NAPS continuous data that was loaded into a PostgreSQL database using the [NAPSContinuousDataLoader](#napscontinuousdataloader). It will output a CSV file containing a table of data based on the query rules that you provide. This tool is intended to be used for aggregating data (i.e. average, sum, minimum, maximum, etc.) that is grouped by one or more fields (e.g. pollutant, site, year, month, day, etc.). If you need to generate large tables of data that do not involve grouping functions, have a look at the [NAPSContinuousDataExporter](#napscontinuousdataexporter). |
| 442 | +This powerful Java tool allows you to dynamically query the NAPS continuous data that was loaded into a PostgreSQL database using the [NAPSContinuousDataLoader](#napscontinuousdataloader). It will output a CSV and/or JSON file containing a table or map of data based on the query rules that you provide. This tool is intended to be used for aggregating data (e.g. average, sum, minimum, maximum) that is grouped by one or more fields (e.g. pollutant, site, year, month, day). If you need to generate large tables of data that do not involve grouping functions, have a look at the [NAPSContinuousDataExporter](#napscontinuousdataexporter). |
442 | 443 |
|
443 | 444 | You can invoke this tool by running the class `com.dbf.naps.data.analysis.query.continuous.NAPSContinuousDataQuery`. |
444 | 445 |
|
@@ -528,8 +529,8 @@ You can invoke this tool by running the class `com.dbf.naps.data.analysis.query. |
528 | 529 | - The possible values for `urbanization` are `LU, MU, SU, NU`, representing `Large Urban, Medium Urban, Small Urban, Rural (Non Urban)` (respectively). |
529 | 530 | - Both site (station) names and city names are treated as case-insensitive partial matches. This means a value of `labrador` will match the city name of `LABRADOR CITY`. |
530 | 531 | - See the [section below](#pollutants) for a list of all supported pollutants. |
531 | | -- The possible values for `methods` are `170, 181, 184, 195, 236, 636, 703, 706, 731, 760`. These represent the main analytical methods used for analysis and only apply to the PM2.5 pollutant. All other pollutants are have a value of `N/A`. |
532 | | -- The possible values for `reportTypes` are `CO, NO, NO2, NOX, O3, PM10, PM2.5, SO2`, corresponding directly to the pollutant names. These represent the type of report from which the data was originally sourced. |
| 532 | +- The possible values for `methods` are `170, 181, 184, 195, 236, 636, 703, 706, 731, 760`. These represent the main analytical methods used for analysis and only apply to the PM2.5 pollutant. All other pollutants have a value of `N/A`. See the [section below](#methods-and-report-types) for more information. |
| 533 | +- The possible values for `reportTypes` are `CO, NO, NO2, NOX, O3, PM10, PM2.5, SO2`, corresponding directly to the pollutant names. These represent the type of report from which the data was originally sourced. See the [section below](#methods-and-report-types) for more information. |
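
The case-insensitive partial matching of site and city names described in the notes above can be sketched as follows. This is only an illustration of the matching rule, not the tool's actual code (which may well do the comparison in SQL); the class `NameMatcher`, its `match` method, and the sample city list are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical illustration of case-insensitive partial matching; not the tool's actual code.
public final class NameMatcher {
    private NameMatcher() {}

    /** Returns the candidates that contain the filter text, ignoring case. */
    public static List<String> match(String filter, List<String> candidates) {
        String needle = filter.toLowerCase(Locale.ROOT);
        return candidates.stream()
                .filter(c -> c.toLowerCase(Locale.ROOT).contains(needle))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        List<String> cities = List.of("LABRADOR CITY", "TORONTO");
        System.out.println(match("labrador", cities)); // [LABRADOR CITY]
    }
}
```

Because the match is a substring test, a short filter value can select more than one site or city.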
533 | 534 |
|
534 | 535 | **Other Notes:** |
535 | 536 | - A title will be automatically generated for the report based on the aggregation and filtering rules that you provide. You can override this title by using the `--title` option. Setting it to empty `""` will omit it entirely. |
@@ -731,8 +732,8 @@ The default colour palette, if not specified, is number 1. Here are examples of |
731 | 732 | - The possible values for `urbanization` are `LU, MU, SU, NU`, representing `Large Urban, Medium Urban, Small Urban, Rural (Non Urban)` (respectively). |
732 | 733 | - Both site (station) names and city names are treated as case-insensitive partial matches. This means a value of `labrador` will match the city name of `LABRADOR CITY`. |
733 | 734 | - See the [section below](#pollutants) for a list of all supported pollutants. |
734 | | -- The possible values for `methods` are `170, 181, 184, 195, 236, 636, 703, 706, 731, 760`. These represent the main analytical methods used for analysis and only apply to the PM2.5 pollutant. All other pollutants are have a value of `N/A`. |
735 | | -- The possible values for `reportTypes` are `CO, NO, NO2, NOX, O3, PM10, PM2.5, SO2`, corresponding directly to the pollutant names. These represent the type of report from which the data was originally sourced. |
| 735 | +- The possible values for `methods` are `170, 181, 184, 195, 236, 636, 703, 706, 731, 760`. These represent the main analytical methods used for analysis and only apply to the PM2.5 pollutant. All other pollutants have a value of `N/A`. See the [section below](#methods-and-report-types) for more information. |
| 736 | +- The possible values for `reportTypes` are `CO, NO, NO2, NOX, O3, PM10, PM2.5, SO2`, corresponding directly to the pollutant names. These represent the type of report from which the data was originally sourced. See the [section below](#methods-and-report-types) for more information. |
736 | 737 |
|
737 | 738 | **Rendering Options:** |
738 | 739 | - The `colourLowerBound` and `colourUpperBound` options can be used to limit the scale that is mapped to the colour gradient. This is useful for emphasizing differences that appear in the centre of the overall range of values, or for preventing outliers from shifting the entire scale. When specified, the legend will indicate that a lower or upper bound is in effect by adding `>=` and `<=` to the bottom and top of the scale, respectively. If not specified, the minimum and maximum values of the colour gradient scale will be calculated automatically. |
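
The effect of the two colour bounds can be pictured as a clamp applied before a value is mapped onto the gradient. The sketch below is a minimal illustration under that assumption, not the heat map's actual rendering code; the class `ColourScale` and the method `toGradientPosition` are hypothetical names.

```java
// Hypothetical helper, not part of the NAPS tools: shows how bounds limit a colour scale.
public final class ColourScale {
    private ColourScale() {}

    /**
     * Maps a value to a position in [0, 1] on the colour gradient.
     * Values outside [lowerBound, upperBound] are clamped to the nearest end.
     */
    public static double toGradientPosition(double value, double lowerBound, double upperBound) {
        if (upperBound <= lowerBound) {
            throw new IllegalArgumentException("upperBound must be greater than lowerBound");
        }
        double clamped = Math.max(lowerBound, Math.min(upperBound, value));
        return (clamped - lowerBound) / (upperBound - lowerBound);
    }

    public static void main(String[] args) {
        // With bounds of 10 and 50, an outlier of 400 gets the same colour as 50.
        System.out.println(toGradientPosition(400.0, 10.0, 50.0)); // 1.0
        System.out.println(toGradientPosition(30.0, 10.0, 50.0));  // 0.5
    }
}
```

Clamping keeps a single extreme value from compressing every other cell into a narrow band of colours.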
@@ -815,7 +816,7 @@ You can invoke this tool by running the class `com.dbf.naps.data.loader.integrat |
815 | 816 |
|
816 | 817 | ## NAPSIntegratedDataQuery |
817 | 818 |
|
818 | | -This powerful Java tool allows you to dynamically query the NAPS integrated data that was loaded into a PostgreSQL database using the [NAPSIntegratedDataLoader](#napsintegrateddataloader). It will output a CSV file containing a table of data based on the query rules that you provide. It functions the same as the [NAPSContinuousDataQuery](#napscontinuousdataloader) and accepts all of the same command line arguments, with the exception that the data fields used for grouping cannot include `HOUR`, since hour attribute only applies to continuous data, not integrated data. |
| 819 | +This powerful Java tool allows you to dynamically query the NAPS integrated data that was loaded into a PostgreSQL database using the [NAPSIntegratedDataLoader](#napsintegrateddataloader). It will output a CSV and/or JSON file containing a table or map of data based on the query rules that you provide. It functions the same as the [NAPSContinuousDataQuery](#napscontinuousdataquery) and accepts all of the same command line arguments, with the exception that the data fields used for grouping cannot include `HOUR`, since the hour attribute only applies to continuous data, not integrated data. |
819 | 820 |
|
820 | 821 | You can invoke this tool by running the class `com.dbf.naps.data.analysis.query.integrated.NAPSIntegratedDataQuery`. |
821 | 822 |
|
@@ -875,8 +876,8 @@ You can invoke this tool by running the class `com.dbf.naps.data.analysis.query. |
875 | 876 | **Notes:** |
876 | 877 | - Possible values for `group1` through `group5` are `YEAR, MONTH, DAY, DAY_OF_WEEK, DAY_OF_YEAR, WEEK_OF_YEAR, NAPS_ID, POLLUTANT, PROVINCE_TERRITORY, SITE_TYPE, URBANIZATION`. |
877 | 878 | - AQHI values are not supported for the integrated data set. |
878 | | -- The possible values for `methods` are `ED-XRF, GC-FID, GC-MS, GC-MS TP+G, HPLC, IC, IC-PAD, ICPMS, Microbalance, TOR, WICPMS`. These represent the main analytical methods used for analysis. |
879 | | -- The possible values for `reportTypes` are `CARB, CARBONYLS, DICHOT, HCB, IC, ICPMS, LEV, NA, NH4, PAH, PCB, PCDD, PM10, PM2.5, PM2.5-10, SPEC, VOC, VOC_4HR, WICPMS`. These represent the type of report from which the data was originally sourced. |
| 879 | +- The possible values for `methods` are `ED-XRF, GC-FID, GC-MS, GC-MS TP+G, HPLC, IC, IC-PAD, ICPMS, Microbalance, TOR, WICPMS`. These represent the main analytical methods used for analysis. See the [section below](#methods-and-report-types) for more information. |
| 880 | +- The possible values for `reportTypes` are `CARB, CARBONYLS, DICHOT, HCB, IC, ICPMS, LEV, NA, NH4, PAH, PCB, PCDD, PM10, PM2.5, PM2.5-10, SPEC, VOC, VOC_4HR, WICPMS`. These represent the type of report from which the data was originally sourced. See the [section below](#methods-and-report-types) for more information. |
880 | 881 | - With the exception of the above, all of the other rules and restrictions of the [NAPSContinuousDataQuery](#napscontinuousdataquery) apply. |
881 | 882 |
|
882 | 883 | ## NAPSIntegratedHeatMap |
@@ -943,8 +944,8 @@ You can invoke this tool by running the class `com.dbf.naps.data.analysis.heatma |
943 | 944 | **Notes:** |
944 | 945 | - Possible values for `group1` and `group2` are `YEAR, MONTH, DAY, DAY_OF_WEEK, DAY_OF_YEAR, WEEK_OF_YEAR, NAPS_ID, POLLUTANT, PROVINCE_TERRITORY, SITE_TYPE, URBANIZATION`. |
945 | 946 | - AQHI values are not supported for the integrated data set. |
946 | | -- The possible values for `methods` are `ED-XRF, GC-FID, GC-MS, GC-MS TP+G, HPLC, IC, IC-PAD, ICPMS, Microbalance, TOR, WICPMS`. These represent the main analytical methods used for analysis. |
947 | | -- The possible values for `reportTypes` are `CARB, CARBONYLS, DICHOT, HCB, IC, ICPMS, LEV, NA, NH4, PAH, PCB, PCDD, PM10, PM2.5, PM2.5-10, SPEC, VOC, VOC_4HR, WICPMS`. These represent the type of report from which the data was originally sourced. |
| 947 | +- The possible values for `methods` are `ED-XRF, GC-FID, GC-MS, GC-MS TP+G, HPLC, IC, IC-PAD, ICPMS, Microbalance, TOR, WICPMS`. These represent the main analytical methods used for analysis. See the [section below](#methods-and-report-types) for more information. |
| 948 | +- The possible values for `reportTypes` are `CARB, CARBONYLS, DICHOT, HCB, IC, ICPMS, LEV, NA, NH4, PAH, PCB, PCDD, PM10, PM2.5, PM2.5-10, SPEC, VOC, VOC_4HR, WICPMS`. These represent the type of report from which the data was originally sourced. See the [section below](#methods-and-report-types) for more information. |
948 | 949 | - With the exception of the above, all of the other rules and restrictions of the [NAPSContinuousHeatMap](#napscontinuousheatmap) apply. |
949 | 950 |
|
950 | 951 | ## NAPSIntegratedDataExporter |
@@ -1386,6 +1387,77 @@ The following table lists all of the compounds (pollutants) and how many data po |
1386 | 1387 | |Zinc|128947| |
1387 | 1388 | </details> |
1388 | 1389 |
|
| 1390 | +# Methods and Report Types |
| 1391 | + |
| 1392 | +I have tried my best to categorize all of the NAPS data based on both the broad analytical method that was used for analysis (e.g. GC-MS, IC, ED-XRF) and the report type from which the data originated. The report type roughly corresponds to the naming scheme of the raw data files posted on the [NAPS data portal](https://data-donnees.az.ec.gc.ca/data/air/monitor/national-air-pollution-surveillance-naps-program/), which is not always consistent throughout the years. |
| 1393 | + |
| 1394 | +<details> |
| 1395 | +<summary>Table of Methods and Report Types</summary> |
| 1396 | + |
| 1397 | +### As of March 2025 |
| 1398 | + |
| 1399 | +**Continuous** |
| 1400 | +|Report Type|Method|Units| |
| 1401 | +|:--- | :---| :---| |
| 1402 | +|CO|N/A|ppm| |
| 1403 | +|O3|N/A|ppb| |
| 1404 | +|PM10|N/A|µg/m³| |
| 1405 | +|SO2|N/A|ppb| |
| 1406 | +|NO2|N/A|ppb| |
| 1407 | +|NOX|N/A|ppb| |
| 1408 | +|NO|N/A|ppb| |
| 1409 | +|PM2.5|706|µg/m³| |
| 1410 | +|PM2.5|731|µg/m³| |
| 1411 | +|PM2.5|170|µg/m³| |
| 1412 | +|PM2.5|184|µg/m³| |
| 1413 | +|PM2.5|703|µg/m³| |
| 1414 | +|PM2.5|760|µg/m³| |
| 1415 | +|PM2.5|181|µg/m³| |
| 1416 | +|PM2.5|195|µg/m³| |
| 1417 | +|PM2.5|236|µg/m³| |
| 1418 | +|PM2.5|636|µg/m³| |
| 1419 | + |
| 1420 | +**Integrated** |
| 1421 | +|Report Type|Method|Units| |
| 1422 | +|:--- | :---| :---| |
| 1423 | +|PCB|GC-MS|pg/m³| |
| 1424 | +|LEV|IC|µg/m³| |
| 1425 | +|PAH|GC-MS TP+G|µg/m³| |
| 1426 | +|PM2.5|ICPMS|µg/m³| |
| 1427 | +|PM2.5|TOR|µg/m³| |
| 1428 | +|PM2.5|IC-PAD|µg/m³| |
| 1429 | +|PM2.5|IC|ppbv| |
| 1430 | +|PM2.5-10|Microbalance|µg/m³| |
| 1431 | +|PM2.5-10|ED-XRF|µg/m³| |
| 1432 | +|PM2.5-10|IC|µg/m³| |
| 1433 | +|PM2.5-10|ICPMS|µg/m³| |
| 1434 | +|PAH|Microbalance|µg/m³| |
| 1435 | +|CARBONYLS|HPLC|µg/m³| |
| 1436 | +|VOC_4HR|GC-FID|µg/m³| |
| 1437 | +|VOC_4HR|GC-MS|µg/m³| |
| 1438 | +|PM10|Microbalance|µg/m³| |
| 1439 | +|PM10|ED-XRF|µg/m³| |
| 1440 | +|DICHOT|Microbalance|µg/m³| |
| 1441 | +|DICHOT|IC|µg/m³| |
| 1442 | +|DICHOT|ED-XRF|µg/m³| |
| 1443 | +|PAH|GC-MS|µg/m³| |
| 1444 | +|PCDD|GC-MS|pg/m³| |
| 1445 | +|VOC|GC-FID|µg/m³| |
| 1446 | +|VOC|GC-MS|µg/m³| |
| 1447 | +|HCB|GC-MS|µg/m³| |
| 1448 | +|PM2.5|Microbalance|µg/m³| |
| 1449 | +|PM2.5|IC|µg/m³| |
| 1450 | +|PM2.5|ED-XRF|µg/m³| |
| 1451 | +|CARB|TOR|µg/m³| |
| 1452 | +|WICPMS|WICPMS|µg/m³| |
| 1453 | +|SPEC|IC|µg/m³| |
| 1454 | +|SPEC|TOR|µg/m³| |
| 1455 | +|NH4|IC|µg/m³| |
| 1456 | +|IC|IC|µg/m³| |
| 1457 | +|ICPMS|ICPMS|µg/m³| |
| 1458 | +|NA|IC|µg/m³| |
| 1459 | +</details> |
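
As a rough illustration of how the continuous table above could be used to sanity-check a `reportTypes`/`methods` combination before running a query, consider the sketch below. It is not part of the NAPS tools; the class `ContinuousMethods` and its `isValid` method are hypothetical, and the value sets are copied from the continuous table.

```java
import java.util.Set;

// Hypothetical validator built from the continuous table above; not part of the NAPS tools.
public final class ContinuousMethods {
    private static final Set<String> REPORT_TYPES =
            Set.of("CO", "NO", "NO2", "NOX", "O3", "PM10", "PM2.5", "SO2");
    private static final Set<String> PM25_METHODS =
            Set.of("170", "181", "184", "195", "236", "636", "703", "706", "731", "760");

    private ContinuousMethods() {}

    /** Returns true if the report type and method appear together in the continuous table. */
    public static boolean isValid(String reportType, String method) {
        if (!REPORT_TYPES.contains(reportType)) {
            return false;
        }
        if ("PM2.5".equals(reportType)) {
            // Only PM2.5 carries numeric method codes.
            return PM25_METHODS.contains(method);
        }
        // Every other continuous report type has the method N/A.
        return "N/A".equals(method);
    }

    public static void main(String[] args) {
        System.out.println(isValid("PM2.5", "706")); // true
        System.out.println(isValid("O3", "706"));    // false
        System.out.println(isValid("CO", "N/A"));    // true
    }
}
```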
| 1460 | + |
1389 | 1461 | # Database Design |
1390 | 1462 |
|
1391 | 1463 | I am using a normalized relational PostgreSQL database to store the data. I have chosen to hold the continuous data and the integrated data in separate tables to improve performance. I don't think there is a frequent need to query the data in both tables at the same time. The following diagram illustrates the schema design. |
|