add Kimber soiling model #860

mikofski · 2020-01-15T09:07:00Z

Closes Kimber Dynamic Soiling model #669
I am familiar with the contributing guidelines
Tests added
Updates entries to docs/sphinx/source/api.rst for API changes.
Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
Pull request is nearly complete and ready for detailed review.
Maintainer: Appropriate GitHub Labels and Milestone are assigned to the Pull Request and linked Issue.

generated with the following snippet:

from pvlib.losses import soiling_kimber
import pvlib
x = pvlib.iotools.read_tmy3('path/to/722287TYA.CSV', coerce_year=1990)
z = soiling_kimber(x[0].Lprecipdepth)
z.plot()
import datetime
manwash = [datetime.date(1990,11,1), ]
z = soiling_kimber(x[0].Lprecipdepth, manual_wash_dates=manwash)
z.plot()
from matplotlib import pyplot as plt
plt.grid()
plt.title('Kimber Soiling Model')
plt.ylabel('soiling build-up fraction')
x[1]
plt.legend([x[1]['Name'], 'manual wash Nov. 1st'])

cwhanse · 2020-01-15T17:04:33Z

with this and #850 do we want to:

put all loss functions into losses.py (for now),
put all soiling loss functions into soiling.py and anticipate N x other_loss_type.py, or
create a losses folder to contain soiling.py, etc.?

mikofski · 2020-01-15T17:39:40Z

Oops! Sorry, I can move this to losses.py. I say we kick this question down the road until we are forced to decide. I don't think it will make a big difference now, and hopefully, we should be able to handle the API changes later without too much stress. Is that okay?

Also I was thinking of waiting on this until #850 and #859 are merged, just to make it easier to sync. Thanks!

pvlib/soiling.py

Signed-off-by: Mark Mikofski <[email protected]>

* also add a gallery example draft

mikofski · 2020-02-13T02:12:27Z

@kanderso-nrel I get this error from my example:

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/pvlib-python/checkouts/860/docs/examples/plot_greensboro_kimber_soiling.py", line 48, in <module>
    soiling_no_wash.index, soiling_no_wash.values*100.0)
TypeError: float() argument must be a string or a number, not 'Timestamp'

But it works on my machine, do I have the wrong matplotlib version? How do you plot timestamps? I guess I can plot from pandas instead of using pyplot

* use pd.to_datetime for plt * start to add manwash test using datetime

kandersolar · 2020-02-13T02:35:47Z

I installed the same matplotlib and numpy versions as RTD is using and can recreate the error with pandas 1.0.1, but when using pandas 0.25.1 (instead of 1.0.1 like RTD) I get this warning:

/home/kevin/anaconda3/lib/python3.7/site-packages/pandas/plotting/_matplotlib/converter.py:103: 
FutureWarning: Using an implicitly registered datetime converter for a matplotlib plotting method. 
The converter was registered by pandas on import.
Future versions of pandas will require you to explicitly register matplotlib converters.

To register the converters:
	>>> from pandas.plotting import register_matplotlib_converters
	>>> register_matplotlib_converters()
  warnings.warn(msg, FutureWarning)

Switching to 1.0.1 and implementing that matplotlib registration line appears to fix it. I wonder why they stopped registering it by default, that's a little annoying.

* add test data for manual wash * try to fix example plot using to_pydatetime

mikofski · 2020-02-13T03:23:09Z

All tests are passing, the gallery example tenders, I don't know what that patch problem is. Any reviewers?

kandersolar

Very clean code! I wonder if it would be worth trying to vectorize it? Whenever I see for date in series.index it triggers alarm bells, although I suppose since it's over a daily timeseries the scale is somewhat constrained.

docs/examples/plot_greensboro_kimber_soiling.py

pvlib/losses.py

* reuse rainfall events * combine zero soil assignments

* correct Adrianne spelling * add table of soiling rates from figure 3 from Kimber paper * fix reference indents in docstring

* fix table malformed table in losses * fix long line in docstring

mikofski · 2020-02-13T10:16:29Z

okay, any more comments? @CameronTStark maybe add to v0.7.2?

cwhanse

I'm OK with the default values. I'd prefer different names for some of the argument, soiling_rate in particular, since the unit is not a mass or mass rate.

A minor quibble: I'd prefer that the file names for the sample data and the example start with soiling_kimber to make it easier to associate these files with the function, not difficult now but as more examples accumulate...

~~We should tell the user that the index returned may not be the same as the index of the input rainfall Series.~~

pvlib/losses.py

kandersolar · 2020-02-13T16:00:10Z

A minor quibble: I'd prefer that the file names for the sample data and the example start with soiling_kimber to make it easier to associate these files with the function

FYI sphinx-gallery defaults to only executing examples if their filename starts with plot_. If you want to name gallery examples some other way, you'll need to configure that behavior: https://sphinx-gallery.github.io/stable/configuration.html#parsing-and-executing-examples-via-matching-patterns

I suggest changing that regex to just match everything -- if for some reason in the future we want to include examples in the gallery without executing them, we can make it more specific at that point.

* combine panels cleaned today with grace-period, since it includes today

kandersolar

Three last suggestions from me.

pvlib/losses.py

pvlib/tests/test_losses.py

* also fix missing "for" in note Co-Authored-By: Kevin Anderson <[email protected]>

* remove extra whitespace

mikofski · 2020-02-14T01:07:48Z

@kanderso-nrel you are right, vectorizing this code makes it run about 20 to 30 times faster on my machine (from 130-140ms to 4-6ms, depending on washes), but it requires the input rainfall time-series to be monotonic - which means that it needs some special code to fix TMY3, just like #889 , but this doesn't add much overhead, about 0.5-1 ms, and I don't think it would effect other non-TMY3 time-series.

There are some pros/cons to the vectorized version.

it can handle rainfall at any time of the day or the week by using rolling windows, so it picks up some soiling that the discreet-daily (unvectorized) version doesn't
also it can start building up soiling at any time of the day, or be cleaned anytime
and cleaning is instantaneous, not a day-long process

Here are some plots to compare against, orange is the vectorized version, blue is not. There are minor differences due to the differences above, but I think they might be improvements? I'd have to save new test data.

No Wash

With Wash

The code is copied from the HSU model, which was already vectorized:

Details

def soiling_kimber_vec(rainfall, cleaning_threshold=6, soiling_loss_rate=0.0015,
                       grace_period=14, max_soiling=0.3, manual_wash_dates=None,
                       initial_soiling=0):

    # convert grace_period to timedelta
    grace_period = datetime.timedelta(days=grace_period)

    # protect against TMY3 by rolling back and then adding 1
    rain_index = np.roll(rainfall.index.values, 1) + np.timedelta64(1, 'h')
    periods_per_day = (rain_index[1] - rain_index[0]) / np.timedelta64(24, 'h')
    rain_tz = rainfall.index.tz
    rain_index = pd.DatetimeIndex(
        rain_index).tz_localize('UTC').tz_convert(rain_tz)

    rainfall = pd.Series(rainfall.values, index=rain_index, name='rainfall')

    daily_rainfall = rainfall.rolling('D').sum()  # closed right

    norain = np.ones_like(rainfall.values) * soiling_loss_rate * periods_per_day
    norain[0] = initial_soiling
    norain = np.cumsum(norain)

    rain_events = daily_rainfall > cleaning_threshold

    grace_windows = rain_events.rolling(grace_period).sum() > 0

    soiling = pd.Series(float('NaN'), index=rainfall.index)
    soiling.iloc[0] = initial_soiling
    soiling[grace_windows] = norain[grace_windows]

    # manual wash dates
    if manual_wash_dates is not None:
        manual_wash_dates = pd.DatetimeIndex(manual_wash_dates).tz_localize(rain_tz)
        norain = pd.Series(norain, index=rain_index, name='soiling')
        soiling[manual_wash_dates] = norain[manual_wash_dates]

    return norain - soiling.ffill()

mikofski · 2020-02-14T07:57:15Z

I guess it goes without saying that the faster, more flexible version is everyone's preference?

mikofski · 2020-02-14T09:07:56Z

Todo

fix soiling[0] is zero, not initial_condition
test for initial condition
where greater than max, use max, else soiling less than max
fix time roll interval use dt not 1H, or check for TMY3 input arg?

cwhanse · 2020-02-14T15:28:57Z

I guess it goes without saying that the faster, more flexible version is everyone's preference?

As long as it's consistent with the publication, yes for me.

mikofski · 2020-02-14T21:21:19Z

As long as it's consistent with the publication, yes for me.

Yes, they are consistent, the only difference is that when Adie did it, I think she was mainly looking at daily rainfall, so that was the default accumulation period and time-step interval, which meant that everything begins and ends at midnnight.

With the vectorized version, we can interpret the accumulation period more broadly, as the total rainfall over the last 24-hours. So it could be 8AM to 8AM instead of always midnight to midnight.

The difference occurs if there's rain overnight. For example, if the threshold is 6mm, what if there's 3mm from 8pm to midnight, and then another 3mm from midnight to 6am? In the daily discreet version (unvectorized) this would not clean the panels because the accumulated rainfall 3mm on day-1 and 3mm on day-2 was never greater than the threshold. But in the rolling (vectorized) version, this would clean the panels.

Personally I think this is better, and now that hourly rainfall is more widely available, I don't see why we wouldn't prefer it.

Here's a copy of the paper: Kimber_2006_Full_Paper_r3.pdf

cwhanse · 2020-02-14T21:56:58Z

With the vectorized version, we can interpret the accumulation period more broadly, as the total rainfall over the last 24-hours. So it could be 8AM to 8AM instead of always midnight to midnight.

The rolling window approach is definitely better than the midnight to midnight approach. Because rainfall threshold is a depth rather than a rate, what about an argument accumulation_period that defines the time window over which the threshold applies? If you agree, maybe default to 24 hours to be faithful to the reference.

mikofski · 2020-02-14T23:37:19Z

what about an argument accumulation_period that defines the time window over which the threshold applies? If you agree, maybe default to 24 hours to be faithful to the reference.

an improvement on the model for sure, I like it, thanks!

* use a rolling window and rain_accum_period to get more accurate rain coverage overnite instead of only discrete days from midnite to midnite * add rain_accum_period to allow deviation from kimber model, recommends only 24h but why not be flexible? * add istmy=False to fix last hour of tmy to be monotonically increasing * add another test for initial condition * fix example input arg cleaning_threshold

mikofski · 2020-02-15T03:23:12Z

Ok @CameronTStark, it's ready, any last comments?

CameronTStark

It looks good to go other than the one NIT. I'll merge either way.

pvlib/losses.py

CameronTStark · 2020-02-15T04:46:34Z

pvlib/tests/test_losses.py

+        parse_dates=True, index_col='timestamp')
+
+
+@needs_pandas_0_22


I didn't get to try this but I assume these @needs_pandas_0_22 are necessary. If so I'm even more inclined to bump pandas version to unify the output between versions. No change necessary, just thinking out loud mostly.

Pandas DataFrame.rolling() kwarg closed isn't available for py35-min on Travis but everything else passes, py35-min uses pandas-0.18.1

My understanding is that the soiling_kimber function does not work at all with pandas 0.18. Correct? If so, that's a different kind of breakage than we've traditionally allowed across dependency versions. The function documentation doesn't give any hint as to why it's going to fail in that scenario either. These @needs* decorators have traditionally been used for only optional dependencies or quirks between versions (like pandas sum of nans = nan or = 0).

I thought we were in agreement that we'd bump the pandas spec for 0.8, not 0.7.2. This suggests that we need to do it for 0.7.2. Not the end of the world, but not ideal.

Oops, I didn't realize that. The existing soiling_hsu also used this closed kwarg in DataFrame.rolling()

pvlib-python/pvlib/losses.py

Line 69 in 64a6498

accum_rain = rainfall.rolling(rain_accum_period, closed='right').sum()

but it's pandas dependency was masked in the test by requires_scipy

pvlib-python/pvlib/tests/test_losses.py

Lines 83 to 84 in 64a6498

@requires_scipy

def test_soiling_hsu(rainfall_input, expected_output_2):

One idea is just dropping this closed kwarg from DataFrame.rolling() in both soiling functions, which works fine in pandas-0.18.1, the default behavior is 'both' which isn't the end of the world, but probably not strictly faithful to the references, which imply exclusion of starting timestamp.

Any other ideas @cwhanse? @kanderso-nrel?

These @needs* decorators have traditionally been used for only optional dependencies or quirks between versions (like pandas sum of nans = nan or = 0).

Thanks for clarifying this further for us @wholmgren.

I thought we were in agreement that we'd bump the pandas spec for 0.8, not 0.7.2. This suggests that we need to do it for 0.7.2. Not the end of the world, but not ideal.

We can revert this merge and chamber it for v0.8. It'll give time to address some of the issues raised brought up and stay consistent with our plan to bump pandas at v0.8. What do you think @wholmgren?

I don't think we need to revert. @mikofski pointed out that the same issue exists in the already released soiling_hsu. We could add a note to the doc strings, or a try/except with a better error message around the rolling calls, or do nothing. I'm ok with any of those.

We should just be more careful with the requires and needs decorators. I know I've been lazy about them in the past.

This is all also an argument for 1) adding scipy to the base requirements so the test suite doesn't have gotchas like with the hsu function (issue #483) and 2) being more aggressive with bumping dependency specs so that contributors don't have to spend as much time supporting older libraries (issue #828).

CameronTStark · 2020-02-15T14:22:21Z

Thanks for your great contribution @mikofski and thanks to @kanderso-nrel and @cwhanse for your feedback!

mikofski · 2020-02-15T15:18:20Z

Thanks! Sorry I didn't have a chance to fix the nit. Maybe we can address it later.

docs/examples/plot_greensboro_kimber_soiling.py

pvlib/tests/test_losses.py

mikofski added 2 commits January 15, 2020 01:04

add Kimber soiling model

1f204ce

remove spaces, thx stickler

d9e8c99

typos: aa = a, rest = reset

1b0b74e

mikofski commented Jan 15, 2020

View reviewed changes

pvlib/soiling.py Outdated Show resolved Hide resolved

mikofski added 2 commits January 20, 2020 23:23

move soiling to losses

ed5c444

Merge branch 'master' into kimber_soil

14ef35c

Signed-off-by: Mark Mikofski <[email protected]>

CameronTStark added the enhancement label Jan 28, 2020

CameronTStark modified the milestones: 0.7.2, Someday Jan 28, 2020

mikofski added 2 commits February 11, 2020 23:31

Merge branch 'master' into kimber_soil

c6c26ac

add test for kimber soiling model

671ff00

* also add a gallery example draft

add kimber to api.rst

55bd7c5

* use pd.to_datetime for plt * start to add manwash test using datetime

mikofski added 2 commits February 12, 2020 18:37

add test for manual wash

8988864

* add test data for manual wash * try to fix example plot using to_pydatetime

please fix kimber gallery example

a1372a6

mikofski marked this pull request as ready for review February 13, 2020 03:23

kandersolar mentioned this pull request Feb 13, 2020

Add calcparams_desoto+singlediode example to gallery #872

Merged

4 tasks

Fix indent on references

e174dba

kandersolar reviewed Feb 13, 2020

View reviewed changes

mikofski added 3 commits February 13, 2020 00:36

add test for no rain, max soil

f01d52f

* reuse rainfall events * combine zero soil assignments

fix typos in Kimber soiling model

04eb758

* correct Adrianne spelling * add table of soiling rates from figure 3 from Kimber paper * fix reference indents in docstring

update what's new w/ kimber model

ec26d8f

* fix table malformed table in losses * fix long line in docstring

cwhanse reviewed Feb 13, 2020

View reviewed changes

change soiling_rate to soiling_loss_rate

0d9ed77

* combine panels cleaned today with grace-period, since it includes today

kandersolar approved these changes Feb 13, 2020

View reviewed changes

pvlib/losses.py Outdated Show resolved Hide resolved

pvlib/losses.py Outdated Show resolved Hide resolved

pvlib/tests/test_losses.py Outdated Show resolved Hide resolved

mikofski and others added 3 commits February 13, 2020 11:49

add a 1-liner for soiling_kimber in the sphinx docs

4736027

* also fix missing "for" in note Co-Authored-By: Kevin Anderson <[email protected]>

separate tests for Kimber

e5d7a93

* remove extra whitespace

delete trailing whitespace in kimber test

c7b1c0e

cwhanse mentioned this pull request Feb 14, 2020

Snow coverage model from SAM #764

Merged

8 tasks

CameronTStark modified the milestones: Someday, 0.7.2 Feb 14, 2020

mikofski added 3 commits February 14, 2020 17:20

fix trailing whitespace in kimber

6acf512

Kimber needs pandas-0.22

75d6df6

CameronTStark approved these changes Feb 15, 2020

View reviewed changes

CameronTStark merged commit 64a6498 into pvlib:master Feb 15, 2020

mikofski deleted the kimber_soil branch February 15, 2020 15:18

wholmgren reviewed Feb 15, 2020

View reviewed changes

docs/examples/plot_greensboro_kimber_soiling.py Show resolved Hide resolved

wholmgren reviewed Feb 15, 2020

View reviewed changes

pvlib/tests/test_losses.py Show resolved Hide resolved

kandersolar mentioned this pull request Mar 2, 2020

Add function to retrieve example datasets #924

Open

cwhanse mentioned this pull request Aug 20, 2020

add Shading, Soiling, and Reflection Losses #61

Closed

	@requires_scipy
	def test_soiling_hsu(rainfall_input, expected_output_2):

add Kimber soiling model #860

add Kimber soiling model #860

Uh oh!

Conversation

mikofski commented Jan 15, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cwhanse commented Jan 15, 2020

Uh oh!

mikofski commented Jan 15, 2020

Uh oh!

Uh oh!

mikofski commented Feb 13, 2020

Uh oh!

kandersolar commented Feb 13, 2020

Uh oh!

mikofski commented Feb 13, 2020

Uh oh!

kandersolar left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mikofski commented Feb 13, 2020

Uh oh!

cwhanse left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kandersolar commented Feb 13, 2020

Uh oh!

kandersolar left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mikofski commented Feb 14, 2020

Uh oh!

mikofski commented Feb 14, 2020

Uh oh!

mikofski commented Feb 14, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cwhanse commented Feb 14, 2020

Uh oh!

mikofski commented Feb 14, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cwhanse commented Feb 14, 2020

Uh oh!

mikofski commented Feb 14, 2020

Uh oh!

mikofski commented Feb 15, 2020

Uh oh!

CameronTStark left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

CameronTStark Feb 15, 2020

Choose a reason for hiding this comment

Uh oh!

mikofski Feb 15, 2020

mikofski commented Jan 15, 2020 •

edited

Loading

cwhanse left a comment •

edited

Loading

mikofski commented Feb 14, 2020 •

edited

Loading

mikofski commented Feb 14, 2020 •

edited

Loading

wholmgren Feb 15, 2020 •

edited

Loading

mikofski Feb 15, 2020 •

edited

Loading

wholmgren Feb 16, 2020 •

edited

Loading