Package 'wxgenR'

Title: A Stochastic Weather Generator with Seasonality
Description: A weather generator to simulate precipitation and temperature for regions with seasonality. Users input training data containing precipitation, temperature, and seasonality (up to 26 seasons). Including weather season as a training variable allows users to explore the effects of potential changes in season duration as well as average start- and end-time dates due to phenomena like climate change. Data for training should be a single time series but can originate from station data, basin averages, grid cells, etc. Bearup, L., Gangopadhyay, S., & Mikkelson, K. (2021). "Hydroclimate Analysis Lower Santa Cruz River Basin Study (Technical Memorandum No ENV-2020-056)." Bureau of Reclamation. Gangopadhyay, S., Bearup, L. A., Verdin, A., Pruitt, T., Halper, E., & Shamir, E. (2019, December 1). "A collaborative stochastic weather generator for climate impacts assessment in the Lower Santa Cruz River Basin, Arizona." Fall Meeting 2019, American Geophysical Union. <https://ui.adsabs.harvard.edu/abs/2019AGUFMGC41G1267G>.
Authors: Subhrendu Gangopadhyay [aut], Lindsay Bearup [aut], David Woodson [aut, cre], Marketa McGuire [aut], Andrew Verdin [aut], Eylon Shamir [aut], Eve Halper [aut]
Maintainer: David Woodson <[email protected]>
License: CC0
Version: 1.4.2
Built: 2024-11-02 04:03:53 UTC
Source: https://github.com/cran/wxgenR

Help Index


Example meteorological training data for weather generator

Description

Weather data (precipitation, temperature, and season) measured at the NWS station (GHCND:USC00440766) in Blacksburg, Virginia.

Usage

data(BlacksburgVA)

Format

A data frame.

Source

Blacksburg, VA NWS office

Examples

data(BlacksburgVA)

Get dates in window

Description

Find the group of dates around each Julian day of year (1-366) based on the window you set. The start and end years for this function should include at least one leap year (i.e., the record should be at least 4 years in length); otherwise the function will return non-existent dates (February 29th in non-leap years).

Setting 'leapflag' to TRUE will set February 29th to NA in non-leap years.

Setting 'leapflag' to FALSE will remove February 29th in non-leap years (recommended).

The 'wwidth' variable is the semi-bandwidth that sets the window size used to search for adjacent days. Given a value of 'wwidth', the window size will be 2*wwidth + 1. For example, a 'wwidth' of 7 gives a window size of 2*7+1 = 15.
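The window arithmetic above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name `window_dates` and the example centre date are assumptions made for this sketch.

```r
# Hypothetical helper (not part of wxgenR): all dates within
# +/- wwidth days of a centre date.
window_dates <- function(centre, wwidth) {
  centre <- as.Date(centre)
  # Offsets -wwidth..wwidth give 2*wwidth + 1 dates in total
  centre + seq(-wwidth, wwidth)
}

win <- window_dates("2001-01-01", wwidth = 7)
length(win)                    # number of days in the window
format(range(win), "%Y%m%d")   # first and last date, as 'yyyymmdd'
```

Note that a window centred near the start of a year reaches back into the previous year, which is why the surrounding record matters.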

Other applications of this function might include a daily bias correction approach where it is necessary to find N adjacent days for each day of year in order to train the bias correction algorithm.

Usage

getDatesInWindow(syr, eyr, smo, emo, sdate, edate, wwidth, leapflag = FALSE)

Arguments

syr

Start year.

eyr

End year.

smo

Start month.

emo

End month.

sdate

Start date.

edate

End date.

wwidth

Window set for finding surrounding days (semi-bandwidth).

leapflag

Flag controlling how February 29th is handled in non-leap years: TRUE sets it to NA, FALSE removes it (default = FALSE).

Value

Returns a matrix with 366 rows (one for each Julian day of year, including leap days) and nCols columns, where nCols = (2 x wwidth + 1) x (eyr - syr + 1). Each row is specific to a certain Julian day (e.g., day 1) and contains the preceding and subsequent dates around that Julian day based on the window length you set. The dates will be fetched for each year in the range you set between the start and end years (inclusive). Matrix values are either dates formatted as 'yyyymmdd' or NA values.
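As a worked instance of the column-count formula, the following base R lines compute the dimensions one would expect for a 2000-2005 record with 'wwidth' = 3. The variable names `n_rows` and `n_cols` are chosen for this sketch only.

```r
# Expected dimensions of the returned matrix, following the formula
# nCols = (2 x wwidth + 1) x (eyr - syr + 1) stated in the text.
syr <- 2000
eyr <- 2005
wwidth <- 3

n_rows <- 366                                   # one row per Julian day
n_cols <- (2 * wwidth + 1) * (eyr - syr + 1)    # 7 days x 6 years
c(rows = n_rows, cols = n_cols)
```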

Examples

getDatesInWindow(syr = 2000, eyr = 2005, smo = 10, emo = 09,
  sdate = 20001001, edate = 20050930, wwidth = 3, leapflag = FALSE)

Example meteorological training data for weather generator

Description

Weather data (precipitation, temperature, and season) for the Lower Santa Cruz River Basin in Southern Arizona. Dataset was developed for the Hydroclimate Analysis within Reclamation's Lower Santa Cruz River Basin Study.

Usage

data(LowerSantaCruzRiverBasinAZ)

Format

A data frame.

Source

Hydroclimate Analysis - Lower Santa Cruz River Basin Study

Examples

data(LowerSantaCruzRiverBasinAZ)

Random variates from the Epanechnikov kernel

Description

Simulate outside the historical envelope using randomly generated values from the Epanechnikov kernel (via acceptance-rejection sampling).

For more details on the Epanechnikov kernel and its use in a weather generator, see Rajagopalan et al. (1996).
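For readers unfamiliar with acceptance-rejection sampling, a minimal base R sketch follows. It is not the source of 'repan()'. It assumes the standard Epanechnikov density f(x) = 0.75 * (1 - x^2) on [-1, 1] with a uniform proposal, and the name `repan_sketch` is hypothetical.

```r
# Hypothetical sketch (not the package's repan()): acceptance-rejection
# sampling from the Epanechnikov density f(x) = 0.75 * (1 - x^2) on [-1, 1].
repan_sketch <- function(nsim) {
  out <- numeric(0)
  while (length(out) < nsim) {
    x <- runif(nsim, min = -1, max = 1)   # proposals from Uniform(-1, 1)
    u <- runif(nsim)                      # uniform draws for the accept test
    # The density peaks at 0.75, so accept x when u * 0.75 <= f(x)
    out <- c(out, x[u * 0.75 <= 0.75 * (1 - x^2)])
  }
  out[seq_len(nsim)]                      # trim any surplus accepted draws
}

set.seed(1)
draws <- repan_sketch(1e4)
range(draws)   # all values lie within [-1, 1]
var(draws)     # the theoretical variance of this kernel is 1/5
```

On average two out of three proposals are accepted, because the density integrates to 1 over a bounding box of area 1.5.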

Usage

repan(nsim)

Arguments

nsim

Number of simulations.

Value

Returns a vector of random variates sampled from the Epanechnikov kernel. 'nsim' number of samples are returned.

References

Rajagopalan, B., Lall, U., & Tarboton, D. G. (1996). Nonhomogeneous Markov Model for Daily Precipitation. Journal of Hydrologic Engineering, 1(1), 33–40. https://doi.org/10.1061/(ASCE)1084-0699(1996)1:1(33)

Examples

repan(nsim = 10)

 #simulate and plot density and distribution function
 #save the previous graphical parameters so they can be restored
 oldpar = par(mfrow=c(1,2), mar=c(2,2.5,2,1),
              oma=c(2,2,0,0), mgp=c(2,1,0), cex.axis=0.8)

 nsim=1e5
 x <- sort(repan(nsim));y=0.75*(1-x^2)
 plot(x,y,xlab="x",ylab="f(x)",type="l",lwd=2)
 grid()
 title (main="Epanechnikov PDF",cex.main=0.8)
 #empirical CDF; named Fx to avoid masking F (FALSE)
 Fx=rank(x)/(nsim+1)
 plot(x,Fx,ylab="F(x)",type="l",lwd=2)
 grid()
 title (main="Epanechnikov CDF",cex.main=0.8)

 #restore the previous graphical parameters
 par(oldpar)

Select transition state

Description

Function selects and returns the transition state given a uniform random number between 0 and 1 and the cumulative probability vector of the state sequence.
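The idea can be sketched in a few lines of base R: the selected state is the first one whose cumulative probability is at least the uniform draw. This is an assumption about the selection rule, including how ties at a boundary are resolved, and `select_state_sketch` is a hypothetical name rather than the package function.

```r
# Hypothetical sketch (not the package's selectState()): return the index
# of the first state whose cumulative probability is >= the uniform draw.
select_state_sketch <- function(uni, wt) {
  which(uni <= wt)[1]
}

wt <- c(0.25, 0.55, 0.85, 1)    # cumulative probabilities for four states
s_low <- select_state_sketch(0.10, wt)   # draw falls in the first interval
s_mid <- select_state_sketch(0.60, wt)   # draw falls in the third interval
c(s_low, s_mid)
```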

Usage

selectState(uni, wt)

Arguments

uni

Uniform random number between 0 and 1.

wt

Cumulative probability vector of states.

Value

Returns an object containing the transition state(s) based on the given cumulative probability vector and random numbers.

Examples

rand = runif(1)

 print(rand)

 selectState(uni = rand, wt = c(0.25, .55, 0.85, 1))

Spell length calculation

Description

Function to calculate the length (duration in years) of wet or dry periods.
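A comparable calculation can be sketched with base R's run-length encoding, `rle()`. This is not necessarily how 'spellLengths()' is implemented; the name `spell_lengths_sketch` and the list element names `dry` and `wet` are assumptions for illustration.

```r
# Hypothetical sketch (not the package's spellLengths()): use run-length
# encoding to split a 0/1 sequence into dry and wet spell lengths.
spell_lengths_sketch <- function(s) {
  runs <- rle(s)   # consecutive runs of identical values
  list(dry = runs$lengths[runs$values == 0],
       wet = runs$lengths[runs$values == 1])
}

spells <- c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0)
res <- spell_lengths_sketch(spells)
res$dry   # lengths of the three dry spells
res$wet   # lengths of the two wet spells
```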

Usage

spellLengths(s)

Arguments

s

A binary vector containing only 0 (dry) and 1 (wet).

Value

Returns a list object containing a vector of dry spell lengths and a vector of wet spell lengths.

Examples

#use 0 for dry and 1 for wet years
 spells = c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0)

 spellLengths(spells)

Meteorological training data for weather generator from nine stations

Description

Weather data (precipitation, temperature) from nine stations within the Global Historical Climatology Network (GHCN):

  • USC00162534 (DONALDSONVILLE 4 SW LA, US)

  • USC00050372 (ASPEN 1 SW CO, US)

  • USC00080236 (ARCHBOLD BIO STATION FL, US)

  • USC00440766 (BLACKSBURG NWS VA, US)

  • USW00014606 (BANGOR INTL. AIRPORT ME, US)

  • USW00094240 (QUILLAYUTE AIRPORT WA, US)

  • USC00346386 (NORMAN 3 SSE OK, US)

  • USC00028795 (TUCSON 17 NW AZ, US)

  • USC00472314 (EAGLE RIVER WI, US)

Usage

data(stationData)

Format

A data frame.

Source

Global Historical Climatology Network

Examples

data(stationData)

Write simulations to file

Description

Write simulation results to .csv files (one .csv file is generated for each trace). Inputs include the weather simulations stored in the list object output from the 'wx()' function as well as the 'nsim' and 'nrealz' variables that were inputs to the 'wx()' function.

A debug flag allows for more detailed reports (debug = TRUE), but setting 'debug = FALSE' is generally recommended for more concise output. Keeping 'debug = FALSE' will also include a simulation time stamp (year, month, day) beginning in year 1.

This function will write the .csv files to your working directory.

Leap years may be included in the simulated weather if they are included in your training data. As a book-keeping measure, non-leap years therefore include a row of 'NA' values at the end of the calendar year so that the total number of rows in each trace is the same.

Usage

writeSim(wxOutput, nsim, nrealz, path = NULL, debug = FALSE)

Arguments

wxOutput

Weather simulations output from 'wx()' function.

nsim

Number of simulation years.

nrealz

Number of realizations (ensemble size).

path

Specified path to where simulation output shall be written. Defaults to current working directory (path = NULL). Specified path should be a character string of the folder location ending with '/'.

debug

Option to include additional variables in the .csv file outputs for debugging and advanced analysis. Includes sampling date, etc. Default = FALSE (off). If debug is off, the weather simulations will have a simulation year time stamp (beginning in year 1) as well as month and day time stamps.

Value

No return value, called to write simulation results to file.

Examples

z = wx(trainingData = LowerSantaCruzRiverBasinAZ,
 eyr = 1990, nsim = 5, nrealz = 5, aseed = 23,
  wwidth = 3, unitSystem = "U.S. Customary",
   ekflag = TRUE, awinFlag = TRUE, tempPerturb = TRUE,
    pcpOccFlag = FALSE, numbCores = NULL)


writeSim(wxOutput = z, nsim = 5, nrealz = 5, path = paste0(tempdir(), "/"), debug = FALSE)

Runs weather generator

Description

Runs the weather generator based on user inputs.

Your input/training data MUST have the following variables, in this order: year, month, day, prcp, temp, season. These variables are case sensitive and must be spelled as specified here.

Your training data should start at the beginning of the calendar year (January 1) as the weather simulator is designed for the full calendar year.
Use the start and end years to subset your input data if desired; otherwise the start and end dates will default to the beginning and end of your dataset.

Using 'ekflag = T' will generate simulations outside of the historical envelope via an Epanechnikov kernel. For more details on the Epanechnikov kernel and its use in a weather generator, see Rajagopalan et al. (1996).


Leap years may be included in the simulated weather if they are included in your training data. As a book-keeping measure, non-leap years therefore include a row of 'NA' values at the end of the calendar year so that the total number of rows in each trace is the same.

The weather generator can handle missing precipitation and temperature data if they are marked as 'NA' in your training data. It will set 'NA' precipitation values to 0 and pass along 'NA' temperature values if that date is sampled for the simulations. Consider replacing any missing data with monthly or daily averages to avoid 'NA' values in your simulated weather.
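One way to fill missing temperatures with monthly averages before training is sketched below in base R. The toy data frame is invented for illustration and is far shorter than a real training record; only the column names come from the requirements stated above.

```r
# Hypothetical toy data using the required column names (values are made up).
train <- data.frame(
  year   = 2000,
  month  = c(1, 1, 1, 2, 2, 2),
  day    = c(1, 2, 3, 1, 2, 3),
  prcp   = c(0, NA, 0.2, 0, 0, NA),   # left as NA; the text says these become 0
  temp   = c(40, NA, 44, 50, NA, 54),
  season = 1
)

# Mean temperature of each calendar month, ignoring missing values,
# repeated for every row of that month.
month_mean <- ave(train$temp, train$month,
                  FUN = function(v) mean(v, na.rm = TRUE))

# Replace each missing temperature with the mean of its month.
missing <- is.na(train$temp)
train$temp[missing] <- month_mean[missing]
train$temp
```

Averaging by month across all years of a real record, or by day of year, follows the same pattern with a different grouping variable.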

Usage

wx(
  trainingData,
  syr = NULL,
  eyr = NULL,
  smo = NULL,
  emo = NULL,
  nsim,
  nrealz,
  aseed,
  wwidth,
  unitSystem,
  ekflag,
  awinFlag,
  tempPerturb,
  pcpOccFlag = FALSE,
  traceThreshold = 0.005,
  numbCores = NULL
)

Arguments

trainingData

Either a matrix, data frame, or path to a .csv file containing the following variables: year, month, day, prcp (daily precipitation), temp (daily temperature), and season (1, 2, ..., N, for N seasons; up to 26 seasons will work, but seasons need to be defined in a meaningful way). Units must be either U.S. Customary (inches, degrees F) or metric (mm, degrees C) and must be specified with the 'unitSystem' input variable. Input data can be station-based, basin averages, grid cells, etc. Input data MUST have these variables: year, month, day, prcp, temp, season.
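A minimal sketch of a data frame with the required columns in the required order is shown below. The constant precipitation and temperature values and the two-season split (May to October versus November to April) are placeholders invented for this illustration, not a recommended season definition.

```r
# Build one calendar year of placeholder training data, starting on January 1.
dates <- seq(as.Date("2001-01-01"), as.Date("2001-12-31"), by = "day")
mon <- as.integer(format(dates, "%m"))

train <- data.frame(
  year   = as.integer(format(dates, "%Y")),
  month  = mon,
  day    = as.integer(format(dates, "%d")),
  prcp   = 0,                               # placeholder daily precipitation
  temp   = 60,                              # placeholder daily temperature
  season = ifelse(mon %in% 5:10, 2L, 1L)    # hypothetical two-season split
)

# Check column names and order against the documented requirement.
required <- c("year", "month", "day", "prcp", "temp", "season")
stopifnot(identical(names(train), required))
head(train, 3)
```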

syr

Optional: subset training data to specific start year (defaults to beginning of training data). Subset will begin on the first day available in 'syr'.

eyr

Optional: subset training data to specific end year (defaults to end of training data). Subset will end on the last day available in 'eyr'.

smo

Training data start month (you can also use to subset your training data).

emo

Training data end month (you can also use to subset your training data).

nsim

Number of simulation years.

nrealz

Number of realizations or traces (i.e., ensemble size).

aseed

Specify a seed for reproducibility.

wwidth

Set the sampling window for each day of year. A lower value for 'wwidth' will sample fewer surrounding days (lower variability) and a higher value will sample more days (higher variability). Typical settings of 'wwidth' are between 2 and 15, resulting in daily sampling windows of 5 days and 31 days, respectively. Can either be a single number for a uniform window width through the year, or a vector of window widths specific to each season in the training data. In the case of variable window widths, the number of window widths should be equal to the number of seasons.

unitSystem

Specify the unit system of your training data. Input a string that is either "U.S. Customary" or "Metric". U.S. Customary corresponds to inches and degrees Fahrenheit, while Metric corresponds to millimeters and degrees Celsius. If Metric is specified, units will automatically be converted to U.S. Customary for weather simulation, then re-converted to Metric for results output.
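The conversions involved are the standard ones between millimeters and inches and between degrees Celsius and degrees Fahrenheit. The helper names below are hypothetical and this is not the package's internal code; it only shows what a round trip between the two unit systems looks like.

```r
# Standard unit conversions (hypothetical helper names).
mm_to_in <- function(mm) mm / 25.4
in_to_mm <- function(inches) inches * 25.4
c_to_f   <- function(deg_c) deg_c * 9 / 5 + 32
f_to_c   <- function(deg_f) (deg_f - 32) * 5 / 9

prcp_mm <- c(0, 2.54, 25.4)
temp_c  <- c(0, 20, 100)

prcp_in <- mm_to_in(prcp_mm)   # metric input converted for simulation
temp_f  <- c_to_f(temp_c)

# Converting back recovers the original metric values.
all.equal(in_to_mm(prcp_in), prcp_mm)
all.equal(f_to_c(temp_f), temp_c)
```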

ekflag

Simulate outside historical envelope using an Epanechnikov kernel? (T/F)

awinFlag

Set to T or TRUE if you would like to see the results of the adaptive window width. If only one or zero precipitation values (>0.01 inches) are found within the initial window width you set from a day where precipitation occurred, it will be iteratively increased until two or more precipitation values are found. By default, the results are not shown.
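The widening rule can be illustrated with a simplified base R sketch. It works on a single daily series rather than on day-of-year windows pooled across years, increases the semi-bandwidth by one day per iteration, and uses the hypothetical name `adaptive_wwidth`; all three are assumptions, not details confirmed by the package documentation.

```r
# Hypothetical sketch: widen the semi-bandwidth around position 'centre'
# until the window holds at least two precipitation values above the
# wet threshold (0.01 inches, as stated in the text).
adaptive_wwidth <- function(prcp, centre, wwidth, wet_threshold = 0.01) {
  repeat {
    idx <- max(1, centre - wwidth):min(length(prcp), centre + wwidth)
    n_wet <- sum(prcp[idx] > wet_threshold, na.rm = TRUE)
    # Stop once two or more wet values are found, or when the window
    # already covers the whole series.
    if (n_wet >= 2 || length(idx) == length(prcp)) return(wwidth)
    wwidth <- wwidth + 1   # assumed step size of one day
  }
}

prcp <- c(0, 0.3, 0, 0, 0, 0.5, 0, 0, 0, 0, 0.2)
final_width <- adaptive_wwidth(prcp, centre = 6, wwidth = 1)
final_width   # semi-bandwidth needed to capture a second wet day
```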

tempPerturb

Set to T or TRUE if you would like to add random noise to the temperature simulations based on a normal distribution fit on the training data.

pcpOccFlag

Set to TRUE to use precipitation occurrence as a variable in the temperature simulation model or set to FALSE to omit precipitation occurrence as a variable. Simulated daily temperature uses concurrent daily precipitation occurrence as a variable if enabled. By default, this is turned off.

traceThreshold

Threshold for determining whether precipitation depth is considered a trace amount or not. Precipitation depths below this value will be considered trace amounts and will not be used for simulation. A default value of 0.005-inches is used based on National Weather Service guidance. If using a custom trace depth, ensure that it is in the same unit system as your training data and specified by the 'unitSystem' flag.
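A small base R sketch of how a trace threshold might be applied is shown below. Treating trace amounts as zero precipitation is an assumption made for illustration; the text above only states that such depths are not used for simulation. The last line converts the default to millimeters, which may be a useful starting point when the training data are metric.

```r
# Hypothetical illustration of flagging trace amounts (values in inches).
trace_threshold <- 0.005
prcp <- c(0, 0.001, 0.004, 0.005, 0.02, 0.5)

# Positive depths strictly below the threshold count as trace amounts.
is_trace <- prcp > 0 & prcp < trace_threshold

# Assumption for this sketch: trace amounts are treated as dry (zero).
prcp_used <- ifelse(is_trace, 0, prcp)
prcp_used

# The same default expressed in millimeters.
trace_threshold_mm <- trace_threshold * 25.4
trace_threshold_mm
```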

numbCores

Enable parallel computing for precipitation simulation by setting the number of cores to use (must be a positive integer greater than or equal to 2). Turned off by default; if set to 0 or 1, it will run as a single thread. Use the function 'detectCores()' from the 'parallel' package to show the number of available cores on your machine.
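A cautious way to choose a value for 'numbCores' is sketched below. The rule of using two cores when at least two are available, and falling back to NULL otherwise, is only an example policy; the variable names are chosen for this sketch.

```r
library(parallel)

# detectCores() can return NA on platforms where the count is unknown.
n_avail <- detectCores()

# Example policy: request two cores when available, otherwise leave
# parallel computing off by passing NULL.
numb_cores <- if (is.na(n_avail) || n_avail < 2) NULL else 2L
numb_cores
```

The resulting value could then be supplied as the 'numbCores' argument of 'wx()'.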

Value

Returns a list containing both inputs to the weather generator as well as outputs.

  • dat.d - User inputs to weather generator, saved for future use.

  • simyr1 - The years sampled for each trace.

  • X - The simulated daily dry/wet sequences for each trace (0 = dry, 1 = wet).

  • Xseas - The simulated season by day for each trace.

  • Xpdate - If precipitation was simulated to occur on a given day, this is the date from which historical precipitation is sampled.

  • Xpamt - The simulated daily precipitation depth.

  • Xtemp - The simulated daily mean temperature.

Examples

data(LowerSantaCruzRiverBasinAZ)

head(LowerSantaCruzRiverBasinAZ)

#No input for `syr` because we want the training period to begin at the beginning of the data
#record (1970), but set `eyr` = 1990 because we want to subset training period to end in 1990.

wx(trainingData = LowerSantaCruzRiverBasinAZ,
 eyr = 1990, nsim = 3, nrealz = 3, aseed = 23,
  wwidth = 3, unitSystem = "U.S. Customary",
   ekflag = TRUE, awinFlag = TRUE, tempPerturb = TRUE,
    pcpOccFlag = FALSE, numbCores = NULL)

wxgenR package

Description

A weather generator with seasonality