Science Stories: Adventures in Bay-Delta Data

  • January 27, 2023

By Rosemary Hartman

Delta Smelt

Small fish with large eye, small pectoral fin, and adipose fin.
Picture of Delta Smelt, photo by Rene Reyes, US Bureau of Reclamation

Wakasagi

Small fish with short pectoral fin, adipose fin, and upright dorsal fin. Very similar to the Delta Smelt.
Picture of a Wakasagi, photo by Rene Reyes, US Bureau of Reclamation

You’ve probably heard of Delta Smelt (Hypomesus transpacificus), and you may have heard of their cousin, the Longfin Smelt (Spirinchus thaleichthys), but there is a third osmerid in the estuary. The Wakasagi (Hypomesus nipponensis), also known as Japanese Smelt, is in the same genus as Delta Smelt, and was once thought to be the same species. It is native to Japan, but was introduced to reservoirs in California by the California Department of Fish and Game in the 1950s, and now it is established throughout the watershed, including the Delta.

But what does this brother of the Delta Smelt do? Is there sibling rivalry? A group of IEP scientists was curious, so they decided to look at all of our existing data to see when, where, how big, and how many Wakasagi are in the Delta and how their environmental tolerances and diet compare to those of Delta Smelt. A paper about their analysis recently came out in the journal San Francisco Estuary and Watershed Sciences.

In order to compare Delta Smelt and Wakasagi, they looked at all the data from thirty different fish datasets from San Francisco Bay, Suisun Marsh/Suisun Bay, the Delta, and the watershed (see map below). This resulted in a dataset with over 250,000 individual Wakasagi! They also looked at data from special studies of Wakasagi and Delta Smelt growth and diets in the Yolo Bypass.

Map of the San Francisco Estuary showing hundreds of sampling points in the San Francisco Bay and Delta with scattered points upstream.
Maps of the San Francisco Bay-Delta Estuary. (A) Four long-term CDFW monitoring surveys and region assignments used for the comparative Delta Smelt and Wakasagi analysis. (B) Sampling locations for a subset of additional surveys used to assess Wakasagi catch, as well as Yolo Bypass surveys used to assess life-history traits including growth, phenology, and diet. Map reproduced from Davis et al, 2022, with permission.

They found some similarities between Delta Smelt and Wakasagi – both fish really like hanging out in the Sacramento Deep Water Ship Channel and both like eating calanoid copepods (a particularly tasty variety of zooplankton). They spawn at about the same time, but Wakasagi are usually a little earlier (though this varies from year to year), and Wakasagi usually grow a little faster. They are similar enough that they sometimes interbreed and produce hybrid offspring.

Wakasagi aren’t the same as Delta Smelt though. Wakasagi aren’t actually very common in the Delta, instead finding their homes farther upstream in reservoirs (they especially seem to like the Feather River, where the screw trap catches tens to hundreds of thousands of Wakasagi per year!). In the Delta they are mostly in the northern region, which might be just them washing in from upstream. Though they were mostly found in freshwater reaches of the Delta, Wakasagi can actually tolerate a wider range of salinity and temperatures than Delta Smelt, but they seem to prefer cooler temperatures.

So, are Wakasagi competing with Delta Smelt for limited food resources? Maybe a bit, but while they play a similar ecological role when they do overlap, they don’t overlap spatially very often, and both Delta Smelt and Wakasagi are rare in the Delta. However, they overlap enough that areas that are good for Wakasagi are probably good for Delta Smelt too. Delta Smelt are becoming more and more endangered, so we can use Wakasagi as indicators of good Delta Smelt conditions and as substitutes for smelt in some laboratory experiments.

Major similarities and differences between Delta Smelt and Wakasagi
Delta Smelt | Wakasagi | Comparison
Annual life span | Annual life span | Similar
Spawn later | Spawn earlier | Different
Eat calanoid copepods | Eat calanoid copepods | Similar
Grow slower | Grow faster | Different
Narrower tolerances | Wider tolerances | Different
Endangered | More common | Different
Native | Non-native | Different
Mostly semi-anadromous | Mostly freshwater | Different
Small and silver | Small and silver | Similar
Loves the North Delta | Loves the North Delta | Similar
Smells like cucumber | Smells like fish | Different

 

Further reading

Categories: General
  • October 11, 2022

By Rosemary Hartman

With help from Arthur Barros and all the zooplankton taxonomists of the Stockton CDFW lab.

Photos by Tricia Bippus (CDFW)

Zooplankton never get as much appreciation as fish (Hartman et al. 2021), but even among zooplankton there are clear favorites. Copepods and mysid shrimp have dozens of publications dedicated to them, but rotifers often get the short end of the stick. Most papers about “zooplankton” in the San Francisco Estuary don’t even mention rotifers. However, the Environmental Monitoring Program works very hard monitoring microzooplankton (guys smaller than 150 microns) and the expert taxonomists at CDFW’s Stockton laboratory spend hours counting and identifying rotifers in those samples. Rotifers are an important link in the food chain connecting bacteria, phytoplankton, and particulate organic matter to fish. They are eaten by larger zooplankton and larval fish (Plaßmann et al. 1997, Burris et al. 2022).

What is a rotifer anyway?

Rotifers are among the simplest multicellular animals on Earth, sometimes called "wheel animals" because they have a ciliated structure on their head that looks a little like a wheel. They are tiny, usually only half a millimeter long, and they eat phytoplankton and bits of organic material floating in the water.

How are the samples collected? Well, it starts with the field crew going out to long-term monitoring stations throughout the Delta. The crew lowers a pump nearly to the bottom, then raises the pump up slowly, sucking in water and zooplankton as it goes. The water is then passed through a 43-micron mesh net until 75 L of water have been filtered. All the critters in the net are carefully preserved in formalin, with a little bit of pink dye added to make the critters stand out better. See Kayfetz et al. (2020) (PDF) and the Zooplankton EDI publication metadata (Barros 2021b) for more information.
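To make the arithmetic concrete, here is a rough sketch of how a microscope count could be turned into a density. This is not EMP's actual processing code. The 75 L volume comes from the description above, but the function name, the `fraction_counted` parameter, and the example numbers are all made up for illustration.

```python
LITERS_PER_CUBIC_METER = 1000.0


def density_per_cubic_meter(count, volume_filtered_l=75.0, fraction_counted=1.0):
    """Estimate organisms per cubic meter from a (sub)sample count.

    count             -- organisms counted under the microscope
    volume_filtered_l -- liters of water passed through the net (75 L above)
    fraction_counted  -- hypothetical share of the sample that was counted
    """
    if volume_filtered_l <= 0 or not 0 < fraction_counted <= 1:
        raise ValueError("volume must be positive and fraction must be in (0, 1]")
    # Scale the subsample count up to the whole sample.
    estimated_total = count / fraction_counted
    # Convert liters filtered to cubic meters, then divide.
    volume_m3 = volume_filtered_l / LITERS_PER_CUBIC_METER
    return estimated_total / volume_m3


# Invented example: 30 rotifers counted in one tenth of the sample.
example_density = density_per_cubic_meter(30, fraction_counted=0.1)
```

Always check the published metadata for the real units and subsampling rules before comparing numbers across datasets.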

Back in the laboratory, trained taxonomists subsample the critters and carefully identify and count them under a microscope. Rotifers are tricky to identify, so most are only identified to the genus level, or lumped into “other rotifers”. The rotifers we see most frequently are:

Synchaeta spp.

  • Swimming form: top-shaped with pointed foot and lateral auricles with bristles at the widest point, bristles around corona.
  • Contracted form: roundish to donut-shaped with corona, auricles and foot sucked in. Not much clear space, organs more prominent than in Asplanchna.

Microscopic photo of synchaeta in both swimming and contracted form.

Synchaeta bicornis

  • Pointed ‘foot’ at posterior end, two ‘horns’ at the anterior end.
  • Body usually curved into a shallow “C” shape.

Polyarthra spp.

  • Body squarish with feather-like appendages at the “corners”.
  • Appendages extend beyond length of the body.

Keratella spp.

  • 6 prominent ‘teeth’ or hooks on the anterior margin. Posterior end variable, with zero, one, or two spines.
  • Rigid lorica.

Microscopic image of Keratella (rotifer).

Trichocerca spp.

  • Mostly cylindrical, more or less curved, tapering at the anterior and posterior ends.
  • Toes asymmetrical: one prominently elongated, filament-like, often held up ventrally.

Microscopic image of Trichocerca (rotifer).

Asplanchna spp.

  • Like a clear bag with few organs inside, more clear space than Synchaeta.
  • No ‘foot’. Contracted form with corona sucked in at one end.

“Other rotifers”

  • Including Brachionus, Platyias, colonial rotifers, Notholca, Filinia, and many more!

Microscopic image of Brachionus and an unidentified rotifer

So, what can we learn from the rotifer data?

Well, we can start by graphing the average rotifer catch at all stations since the zooplankton survey began (Figure 1). The first thing that jumps out at you is that the standard deviation is HUGE! Rotifers (like all zooplankton) are highly variable critters with big changes from station to station, month to month, and year to year. The next thing that probably jumps out at you is that abundances were a LOT higher prior to 1980. What could have driven that decline?

Area plot of rotifer catch per unit effort by year from 1975-2021. There is a drop in catch around 1980.
Figure 1. Average catch per unit effort (number of rotifers per thousand cubic meters) of all rotifers per sample (dark green area). Standard deviation of catch per year (light green area).
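If you want to try a summary like Figure 1 yourself, the core step is just a mean and a standard deviation for each year. Here is a minimal sketch using only the Python standard library. The input layout (one `(year, cpue)` pair per sample) and the numbers are invented, not pulled from the EMP dataset.

```python
from collections import defaultdict
from statistics import mean, stdev


def yearly_mean_and_sd(samples):
    """Return {year: (mean, sd)} from an iterable of (year, cpue) pairs."""
    by_year = defaultdict(list)
    for year, cpue in samples:
        by_year[year].append(cpue)
    summary = {}
    for year, values in sorted(by_year.items()):
        # A sample standard deviation needs two or more values.
        sd = stdev(values) if len(values) > 1 else float("nan")
        summary[year] = (mean(values), sd)
    return summary


# Invented numbers, only to show the shape of the calculation.
toy_samples = [(1975, 120.0), (1975, 20.0), (1975, 400.0),
               (1985, 15.0), (1985, 45.0)]
toy_summary = yearly_mean_and_sd(toy_samples)
```

With catches this variable, the standard deviation can easily be as large as the mean, which is exactly the pattern that jumps out of the figure.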

But that is the average catch for ALL the rotifers lumped together. It might be interesting to look at each taxon individually (Figure 2). Here we can see that all species declined after 1980, but the biggest drops were seen in Keratella, Polyarthra, and Trichocerca. Synchaeta didn’t show quite as big a drop. We can also see that Synchaeta is usually the most common taxon, while Asplanchna is pretty rare. Lots of other researchers have noticed a big drop in copepods and chlorophyll after 1986 when the invasive clam Potamocorbula amurensis started to take over the area (Kimmerer et al. 1994, Kimmerer and Thompson 2014, Kimmerer and Lougee 2015), but no one has looked at the post-1980 rotifer crash!

Bar plot of rotifer catch per unit effort by year for each of the six major rotifer taxa.
Figure 2. Catch of major species of rotifers caught by EMP over time. You can see that the abundance of many species of rotifers declined sharply around 1980. You can also see that Synchaeta, Keratella, and Polyarthra were the most common species.

Since 1980, the biggest years for rotifers were 2017 and 2011, both of which were really wet years. Maybe rotifers like wetter years better? Let’s subset our data so we just have data from after 1980 and see how water year type affects rotifer catch (Figure 3). The pattern isn’t super clear – all taxa had high catches in 2017, but not all wet years had high catches, and some taxa (like Asplanchna) also had high catches during drier years. However, when we graph the average total rotifer catch versus the Sacramento Valley Index (a measure of water availability), we see a positive correlation between water flow and rotifer catch (Figure 4). Why might this be? Are they getting moved in from upstream? Or are they reproducing faster?

Bar plot showing rotifer catch per unit effort by year with bars labeled with different water year types.
Figure 3. Catch per unit effort of each rotifer taxon over time, with bars color-coded with water year type.
Scatter plot showing rotifer catch per unit effort versus the Sacramento Valley Water Year index with a positive correlation.
Figure 4. Plot of total rotifer catch per unit effort versus Sacramento Valley Water Year index with different shapes and colors indicating water year type. The line indicates a linear model showing an increase in rotifer abundance with increased flow.
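The post doesn't spell out the details of the linear model in Figure 4, so treat the following as a plain ordinary-least-squares sketch rather than the exact analysis. The function name and the index and catch values are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations in x, and of cross-products of deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented water year index values and mean annual catches.
toy_index = [4.0, 6.0, 8.0, 10.0, 12.0]
toy_cpue = [100.0, 160.0, 150.0, 240.0, 250.0]
slope, intercept = fit_line(toy_index, toy_cpue)
```

A positive slope says catch tends to go up with the index, but it can't tell you whether rotifers are being washed in from upstream or reproducing faster. That part still needs a biologist.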

Of course, there are lots of different ways to display the data. We can make area plots, bar plots, streamflow plots, pie charts, maps, or pie charts on top of maps (Figure 5)! Different types of graphs help you see the data in different ways and pull out different patterns.

Map of the estuary showing rotifer abundance in different regions with pie charts.
Figure 5. Map of mean rotifer CPUE from 2017, which was one of the biggest years for rotifers since the 1970s. Each pie chart represents one of EMP’s long-term monitoring stations, with the size of the pie chart corresponding to the total rotifer abundance. The South Delta and Suisun Marsh stations were especially high in rotifers, with more Synchaeta in the Marsh and more Polyarthra and other rotifers in the South Delta.

Are you interested in finding more patterns in the data?

You can visualize the data yourself on the ZoopSynth Shiny app (which also lets you download the data). However, before you dig in, be sure to read all of the metadata available on the Zooplankton EDI publication. You can also read some of the most recent Status and Trends reports published in the IEP newsletter for more ideas about useful patterns waiting for you to discover (Barros 2021a). Feel free to reach out if you have any questions or find any cool patterns! We love talking about zooplankton. Consider sharing your findings with the Zooplankton PWT too!

References and further reading

Categories: Underappreciated data
  • August 29, 2022
Many small crabs running around on a tray

More underappreciated data!

This is the second blog in our series on underutilized datasets from IEP.

San Francisco Bay Study’s Crab Catch dataset

Curated by Kathy Hieb and Jillian Burns

The San Francisco Bay Study has been sampling with otter trawls and midwater trawls throughout the San Francisco Bay, Suisun Bay, and Delta since 1980. Their fish data have been used in a number of scientific studies, regulatory decisions, and journal articles. However, did you know they measure and count crabs in their nets too?

Bay Study’s stations are all categorized as “Shoal” (shallow areas) or “Channel” (deeper areas). Crabs are collected by otter trawl, which is towed along the bottom of the water, scraping up whatever demersal fishes and invertebrates it comes across. Truth be told, it’s not the best way to catch crabs, because most crabs like hiding under rocks where they are out of the way of the net, but it does give us a metric of status and trends of some of the most common species of crabs, including the Pacific rock crab (Romaleon antennarium), the graceful rock crab (Cancer gracilis, also known as the slender rock crab), the red rock crab (Cancer productus), and everyone’s favorite, the Dungeness crab (Metacarcinus magister).

After the net has been towed on the bottom for five minutes, it’s brought on board the boat and the biologists count, measure, and sex the crabs they’ve caught (Figure 1). This can be tricky, because crabs can be FAST! Especially the smaller Dungeness crabs (Figure 2). The biologists have to be careful and pick up the crabs by their back side to avoid getting pinched by their claws, which definitely takes practice.

a large crab is held by the back of its shell and is being measured with calipers
Figure 1. Each crab is carefully measured using calipers. This is where experienced biologists have to practice holding the crabs carefully to avoid being pinched. Image credit, Lynn Takata, Delta Science Program.
tray full of several dozen small crabs
Figure 2. Lots of little crabs! Juvenile crabs can be particularly hard to catch, and particularly hard to tell apart. Image credit: Kathy Hieb, CDFW.

Once all the crabs are counted and measured, they are entered into a database that goes back to 1980. Bay Study's Dungeness crab data have been used to help manage the commercial crab fishery because fisheries-independent data are valuable. From 1975 to 1978, an estimated 38-82% of the Dungeness crabs in the central California region reared in the San Francisco Estuary each year (Wilde and Tasto 1983). This dataset was also very helpful in tracking the introduction, expansion, and decline of the Chinese mitten crab (Eriocheir sinensis), which briefly took over the brackish regions of the estuary but declined as rapidly as it arrived (Figure 3; Rudnick et al. 2003). Bay Study's crab data have also been combined with other datasets to see how the estuarine community as a whole responds to climate patterns and human impacts (Cloern et al. 2010).

line graph showing annual average catch per trawl of five species of crabs caught by Bay Study in each region of the Estuary (South Bay, Central Bay, Suisun, and the West Delta)
Figure 3. Annual mean catch per trawl of the most common species of crabs across each region of the estuary. Dungeness crabs are the most frequently caught, with peaks in South Bay, Central Bay, and San Pablo in 2013 and 2016. Chinese mitten crabs had a spike in abundance in Suisun and the West Delta around 2002, but are rarely caught before or after. The red rock crab, graceful rock crab, and Pacific rock crab are only caught in South Bay, San Pablo, and Central Bay, and then only in low abundances.

However, a lot of questions remain to be asked of this dataset. Why did we see such high catch of Dungeness crabs in 2013 and 2016? What drives the abundance of the lesser-studied crabs, such as the graceful rock crab? How does the salinity preference of each species of crab differ (Figure 4)? If you want to investigate these questions yourself, data are available on the CDFW file library website. But be careful, the data have a few hiccups in them, such as changes to sampling sites over time, missing samples during periods of boat breakdowns, and other caveats. Be sure to read the metadata and make sure you understand the data before using them.

dot plot showing the salinity at which each species of crab is caught
Figure 4. Dot plot showing the salinity of each trawl where each species was found from 1995-2005. The Pacific rock crab, graceful rock crab, and red rock crab mostly occur at high salinity (25-32 PSU), but the Dungeness crab is often found in brackish water (10-32 PSU), and the Chinese mitten crab was found in fresh to brackish water and mostly absent from high salinity water (anything greater than 28 PSU).

Further reading

Categories: BlogDataScience, Underappreciated data
  • August 16, 2022

Some data just needs a little love

IEP collects a lot of data. Most people who work in the estuary have probably heard of FMWT’s Delta Smelt Index, or the Chipps Island salmon trawl, or the EMP zooplankton survey. But those “big name” surveys are only part of what we do at IEP! This is the first blog post in a series on “underappreciated” datasets where we highlight some of the data you might not be familiar with.

Yolo Bypass Fish Monitoring Program’s Drift Invertebrate survey

By Nicole Kwan, Brian Schreier, and Rosemary Hartman

In most of the estuary, we concentrate on invertebrates and other fish food that live under the water. However, in streams and rivers, terrestrial invertebrates that fall into the water from surrounding vegetation and aquatic insects that ‘hatch’ on the surface of the water to metamorphose into their terrestrial adult form are also important food sources for fish, particularly Chinook Salmon and Sacramento Splittail. The Yolo Bypass, a large managed floodplain near Sacramento, is located on the boundary between the estuary and the river. As such, the Yolo Bypass Fish Monitoring Program (YBFMP) tracks both aquatic zooplankton and terrestrial drift invertebrates.

The YBFMP collects drift invertebrates year-round from two sites to compare the seasonal variations in densities and species trends of aquatic and terrestrial invertebrates between the Sacramento River and the Yolo Bypass. The crew piles into a boat and heads out, then tows a rectangular net that sits half-in, half-out of the water for ten minutes along the surface. Sometimes, when flows are really high, they can simply hold the net out on the side of their fish trap for ten minutes and let the water flow through it instead of towing it (Figure 1). The crew then rinses the sample into a bottle, preserves it with formalin, and sends it to a contracted lab for identification and enumeration (counting all the bugs under a microscope).

A woman in a life jacket stands on the deck of a screw trap with a rectangular net held in the flow at the surface of the water.
Figure 1. YBFMP scientist Anji Shakya sampling drift invertebrates in high flows next to the fish trap. Image credit: Naoaki Ikemiyagi, Department of Water Resources.

There are a lot of interesting questions we can ask with these data, such as, what time of year do we catch the most chironomids (midges) (Figure 2)? Or, how do community composition and abundance differ between the Sacramento River and Yolo Bypass (Figure 3), and how does that relate to differences in hydrology and water quality?

A scatter plot of chironomid catch in the Sacramento River and Yolo Bypass with a trend line showing higher abundances in the spring in the Yolo Bypass and higher abundances in the summer in the Sacramento River. Sampling in the summer did not occur until more recent years (after 2010)
Figure 2. Log-transformed catch-per-unit-effort of chironomid midges caught in drift net samples in the Yolo Bypass Toe Drain and Sacramento River at Sherwood Harbor. Note the high abundances of chironomids in the spring on the Yolo Bypass. The Bypass tends to have higher abundances than the Sacramento River in the spring, but lower abundances in the summer. Sampling in summer and fall only started in more recent years.
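A quick note on "log-transformed": catch data often include zeros, and the log of zero is undefined, so a common trick is to add a small offset before taking the log. The sketch below assumes a base-10 log and an offset of 1. Both are assumptions for illustration, not necessarily what was used for Figure 2.

```python
import math


def log_cpue(cpue, offset=1.0):
    """Return log10(cpue + offset); the offset keeps zero catches defined."""
    if cpue < 0:
        raise ValueError("catch per unit effort cannot be negative")
    return math.log10(cpue + offset)


# Invented catches spanning several orders of magnitude.
toy_catches = [0.0, 9.0, 99.0, 999.0]
toy_logged = [log_cpue(c) for c in toy_catches]
```

On the log scale, a jump from 9 to 99 looks the same size as a jump from 99 to 999, which makes it much easier to see seasonal patterns when a few samples are enormous.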

Stacked bar plot showing abundance and community composition of invertebrates collected in the drift net in the Sacramento River and Yolo Bypass by year. Insects are the most common group in all years and both sites. Gastropods are the second most common group in the Yolo Bypass, whereas oligochaetes in the class Clitellata are the second most common in the Sacramento River. Abundances on the Sacramento River are usually about 25% of abundances on the Yolo Bypass
Figure 3. Catch per unit effort of organisms in the drift net categorized by taxonomic order and plotted over time. Insects dominate both the River and the Bypass samples, but the Bypass has consistently higher abundance of drift invertebrates.

One particularly unexpected thing we’ve seen in the data is high abundances of snails in the samples. Snails normally live on the bottom of the water or on vegetation, so seeing them floating on the surface was surprising. We see a lot of variation in snail abundances between years, and we’re not sure why (Figure 4). The wet years of 2017 and 2019 had particularly high snail catch, but other wet years weren’t similar. A fun mystery for someone to investigate!

Bar graph with large standard error bars showing snail catch by year and water year type (average, wet, or dry)
Figure 4. Mean (+/- one standard error) CPUE of snails (class Gastropoda) in drift net samples from the Yolo Bypass. Water year class (Wet - W, Dry - D, or Average - A) is noted with letters under each bar. Notice how snail catch was very high during the wet years of 2017 and 2019, but also during the dry year of 2013 and the average year of 2003.
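For anyone who wants to build error bars like these, the standard error of the mean is the sample standard deviation divided by the square root of the number of samples. Here is a minimal sketch with invented numbers. The function name is hypothetical and this is not the YBFMP's own code.

```python
import math
from statistics import mean, stdev


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a list of CPUE values."""
    n = len(values)
    if n < 2:
        raise ValueError("need two or more samples to estimate a standard error")
    return mean(values), stdev(values) / math.sqrt(n)


# Invented snail catches from one imaginary year.
toy_snail_cpue = [2.0, 4.0, 4.0, 6.0]
toy_mean, toy_se = mean_and_standard_error(toy_snail_cpue)
```

Years with only a few samples, or with one huge catch, will have wide error bars, so a tall bar by itself doesn't prove that year was really different.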

If you want to check out this data for yourself, it has been published on the EDI data repository and will be updated regularly. However, keep in mind that sample frequency, contracting labs, and methods have changed slightly over time. Be sure to read the metadata so you fully understand the data before using it. If you have any questions, just reach out! We’re nice people and we love talking about our data and helping others use it.

Further Reading

Categories: Underappreciated data
  • July 5, 2022

Authors: Rosemary Hartman and the IEP Data Utilization Work Group

Here at IEP, we collect a lot of data, and we do a lot of science. However, people haven’t always realized how much data we collect because it hasn’t always been easy to find. For scientists that were able to find the data, sometimes it was difficult to understand or it was shared in a hard-to-use format. That’s why IEP’s Data Utilization Work Group (DUWG) has been pushing for more Open Science practices over the past five years to make our data more F.A.I.R (Findable, Accessible, Interoperable, and Reusable). And wow! We’ve come a long way in a short time.

A staircase with FAIR Principles written on it and stick figures climbing it. Circles are around the staircase.  One shows a map pin that says Persistent and Findable. One shows an open lock that says 'Accessible' with meaningful interaction. One shows a person and a puzzle and says 'Reusable with Full Disclosure', and one shows two computers with a line between them and says 'Interoperable'.

This image was created by Scriberia for The Turing Way community and is used under a CC-BY license. DOI: 10.5281/zenodo.3332807

What is Open Science anyway? Well, I was going to call it “the cool-kids club” but really it’s the opposite of a club! It’s the anti-club that makes sure everyone has access to science – no membership required. Open science means that all scientists communicate in a transparent, reproducible way, with open-access publications, freely shared data, open-source software, and openness to diversity of knowledge. Open science encourages collaboration and breaks down silos between researchers – so it’s a natural fit for a 9-member collaborative organization like IEP.

For IEP, the ‘open data’ component is where we’ve really been making strides. While “share your data freely” sounds easy, it’s actually taken a lot of work to make our data FAIR. As government entities, theoretically all of the data we collect is held in the public trust, but putting data in a format that other people can use is not simple. Here are some of the things we have done to make IEP data more open:

Data Management Plans

The first thing the DUWG did was get all IEP projects to fill out a simple, 2-page data management plan outlining what was being done with the data in short, clear sections:

  • Who: Principal investigator and point of contact for the data.
  • What: Description of data to be collected and any related data that will be incorporated into the analysis.
  • Metadata: How the metadata will be generated and where it will be made available.
  • Format: What format the data will be stored in and what format it will be shared in, which may not be the same. For example, you may store data in an Access database but share it in non-proprietary .csv formats.
  • Storage and Backup: Where you will put the data as you are collecting it and how it will be backed-up for easy recovery. This is about short-term storage.
  • Archiving and Preservation: This is about long-term storage to keep your data for someone years down the line. This is best done with publication on a data archive platform, such as the Environmental Data Initiative (EDI).
  • Quality Assurance: Brief description of Quality Assurance and Quality Control (QAQC) procedures and where a data user can access full QAQC documentation.
  • Access and Sharing: How can users find your data? Is it posted online or available by request? Are there any restrictions on how the data can be used or shared?

You can find instructions (PDF) and a template (PDF) for Data Management Plans on the DUWG page. All of IEP’s data management plans are also posted on the IEP website.
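If you like to keep this kind of thing machine-checkable, the eight sections above could be treated as a simple checklist. This is only a sketch. The key names and the example plan are invented, and the official template is the PDF on the DUWG page.

```python
# Hypothetical keys, one per section of the data management plan listed above.
REQUIRED_SECTIONS = (
    "who",
    "what",
    "metadata",
    "format",
    "storage_and_backup",
    "archiving_and_preservation",
    "quality_assurance",
    "access_and_sharing",
)


def missing_sections(plan):
    """Return the required sections that are absent or blank in a plan dict."""
    return [s for s in REQUIRED_SECTIONS if not str(plan.get(s, "")).strip()]


# Invented, partly filled-in plan.
draft_plan = {
    "who": "Principal investigator and data contact",
    "what": "Monthly zooplankton counts",
    "format": "Stored in a database, shared as .csv",
    "metadata": "   ",
}
todo = missing_sections(draft_plan)
```

Here `todo` lists the sections that still need writing, including the metadata entry that was left blank.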

Data Publication

Many IEP agencies were already sharing data on agency websites, but most of this was done without formal version control, machine-readable metadata, or digital object identifiers (DOIs), making it difficult to track how data were being used. Now IEP is recommending publishing data on EDI or other data archives. Datasets now have robust metadata, open-source data formats (like .csv tables instead of Microsoft Access databases), and DOIs for each version so studies using these data can be reproduced easily.

Cartoons of stick people illustrating the phases of the data life cycle with arrows connecting them. Data collection - People with nets catching shapes.  Data processing - people take shapes out of a box labeled short-term storage and lay them out on a table. Data Study and Analysis - people make patterns with the shapes. Data publishing and access - People present the data to an audience. Data Preservation - People put shapes in tubes and boxes. Data re-use - people open tubes and a string of shapes come out. Research ideas - Shapes inside a light bulb.

This image was created by Scriberia for The Turing Way community and is used under a CC-BY license. DOI: 10.5281/zenodo.3332807

Metadata Standards

The term “metadata” can mean different things to different people. Some people may think it simply means the definitions of all the columns in a data set. Some people may think it means a history of changes to the sampling program. Some people think it’s your standard operating procedures. Some people may think it means data about social media networks. What is it? Well, it’s the “who, what, where, when, why, and how” of your data set. It should include everything a data user needs to understand your data as well as you do. The DUWG developed a template for metadata that includes everything we think you should include in full documentation for a dataset. Some of it might not apply to every dataset, but it is a good checklist to get you started.

You can find the Metadata template (PDF) on the DUWG page.

QAQC standards

The DUWG is just starting to dig into QAQC. Quality assurance is an integrated system of management activities to prevent errors in your data, while quality control is a system of technical activities to find errors in your data. QAQC systems have become standard practice in analytical labs, but the formalization and standardization of QAQC practices is new for a lot of the fish-and-bug-counters at IEP. The DUWG QAQC sub-team developed a template for Standard Operating Procedures (PDF), and is working to provide guidance for QAQC of all types of data, and for integrating QAQC into all sampling programs. This promotes consistency across time, people, and space, increases transparency, and gives users more confidence in your data.

Dataset integration

One of the great things about laying down the framework for open data that includes data publication, documentation, and quality control is that it then becomes much easier to integrate datasets across programs. The IEP synthesis team (spearheaded by Sam Bashevkin of the Delta Science Program) has developed several integrated datasets that pull publicly accessible data, put them in a standard format, and publish them in a single, easy-to-use format.
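To give a flavor of what "put them in a standard format" involves, here is a toy sketch that renames the columns of two made-up surveys to one shared set of names and stacks the rows together. The survey names, column names, and values are all hypothetical. The real integrated datasets handle far more, such as units, taxonomy, and changes in methods.

```python
import csv
import io

# Hypothetical column names for two made-up surveys; real IEP datasets differ.
COLUMN_MAPS = {
    "survey_a": {"Date": "date", "Stn": "station", "Taxon": "taxon", "CPUE": "cpue"},
    "survey_b": {"sample_date": "date", "site": "station",
                 "species": "taxon", "catch_per_m3": "cpue"},
}


def standardize(source, csv_text):
    """Read one survey's CSV text and return rows using the shared column names."""
    mapping = COLUMN_MAPS[source]
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {new: raw[old] for old, new in mapping.items()}
        row["cpue"] = float(row["cpue"])  # store catch as a number, not text
        row["source"] = source            # remember where each row came from
        rows.append(row)
    return rows


survey_a_csv = "Date,Stn,Taxon,CPUE\n2017-05-01,A1,Synchaeta,12.5\n"
survey_b_csv = "sample_date,site,species,catch_per_m3\n2017-05-02,B7,Keratella,3\n"
integrated = (standardize("survey_a", survey_a_csv)
              + standardize("survey_b", survey_b_csv))
```

Keeping a `source` column is a small design choice that pays off later, because users can always trace a row back to the original survey and its metadata.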

Spreading the Word

We’re also making sure EVERYONE knows about how great our data are.

  • We’ve revamped the data access webpage on our IEP site.
  • Publishing data on EDI makes it available on DataOne, which allows searches across multiple platforms.
  • Publishing data papers is a relatively new way to let people know about a dataset. For example, this zooplankton data paper was recently published in PLOS ONE.
  • We’ve made presentations at the Water Data Science Symposium and other scientific meetings.
  • We published an Open Data Framework Essay in San Francisco Estuary and Watershed Sciences.
  • We also put on a Data Management Showcase (video) that you can watch via the Department of Water Resources YouTube Channel.
  • Plus, we have lots more data management resources available on the DUWG website.

Together, we're putting IEP Data on the Open Science Train to global recognition. 

Questions? Feel free to reach out to the DUWG co-chairs: Rosemary Hartman and Dave Bosworth. If you have any suggestions for improving data management or sharing, we want to hear about it.

Two birds are in a fountain labeled Fountain of Open Data. One asks: You mind if I reuse this data? The other says: Go ahead! we can even work together on it.

This image was created by Scriberia for The Turing Way community and is used under a CC-BY license. DOI: 10.5281/zenodo.3332807

Further Reading

Categories: BlogDataScience