Data Migration

Data migration is the process of transferring data between storage types, formats, or computer systems.

 

When is data migration performed

  • Every time you import newly acquired data into DataSight you are migrating data from one format to another.
  • The process of gathering your historical data into DataSight is also data migration.

 

Challenges in migrating historical data into DataSight

To get the maximum benefit from DataSight, you need to be certain that the information in your database is reliable, complete and consistently defined. This may not be easily achieved for historical data sets. Because you need to standardise your data definitions so that each piece of information has the same meaning, combining multiple data sources and migrating data across systems (data migration and database migration) takes considerable time and effort.

The following may pose challenges during the migration of your data into DataSight:

  • Many different staff have used a historical database over time. Data entry may have been inconsistent, leaving the database with duplicate entries or gaps.
  • You have acquired other systems that were merged with your own database. Data has often been loaded as-is, rather than being accurately translated and imported.
  • You operate more than one database system, exacerbating the issues raised above.
  • Old operational systems have been migrated into new ones, with new data definitions put in place at the time of implementation. However, data from legacy systems still remains in its old format.

You need to be aware of the limitations of your historical data sets and be realistic about the time that may be required to move this set into DataSight. Please refer to the Seveno website and the Knowledge Base for help with known data migration issues.

 

Migrate your historical data into DataSight

To successfully migrate data into DataSight, you need to design a process for data extraction and data loading that relates your old data formats to DataSight's formats and requirements. Data migration starts with auditing your files, followed by developing the rules to standardise data definitions.

 

  1. Standardise your data:

You may wish to ensure your data fields line up with your DataSight levels and variables, particularly if data is taken from spreadsheets, and that the variables are uniform, with any value ranges standardised across all records. This prevents data misinterpretation, enables more accurate selection from lists and helps identify gaps in your data. You may need to remove duplication within the database. Data migration phases (design, extraction, cleansing, load, verification) are commonly repeated several times before you can be confident in the integrity of your historical data set.
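As an illustration only, the standardisation described above could be sketched in Python before the file is imported. The field names in FIELD_MAP and the sample CSV content are hypothetical, not DataSight definitions; replace them with the levels and variables agreed for your own database.

```python
import csv
import io

# Hypothetical mapping from legacy spreadsheet headers to agreed variable names.
FIELD_MAP = {
    "Site Name": "site",
    "site": "site",
    "Date/Time": "timestamp",
    "timestamp": "timestamp",
    "Temp (C)": "temperature_c",
    "temp": "temperature_c",
}


def standardise_rows(rows):
    """Rename fields via FIELD_MAP, trim whitespace and drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        record = {}
        for key, value in row.items():
            name = FIELD_MAP.get(key.strip())
            if name is None:
                continue  # skip columns that have no agreed definition
            record[name] = value.strip()
        fingerprint = tuple(sorted(record.items()))
        if fingerprint in seen:
            continue  # an identical record has already been kept
        seen.add(fingerprint)
        cleaned.append(record)
    return cleaned


# Made-up legacy export: the second line duplicates the first apart from spacing.
legacy_csv = """Site Name,Date/Time,Temp (C)
Bore 1,2020-01-01 09:00:00, 18.2
Bore 1,2020-01-01 09:00:00,18.2
Bore 2,2020-01-01 09:00:00,17.9
"""
rows = list(csv.DictReader(io.StringIO(legacy_csv)))
cleaned = standardise_rows(rows)
```

Only exact duplicates are removed here. Near-duplicates, such as the same reading entered with different rounding, still need a rule of your own.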

 

  2. Streamline your migration:

Certain functions in DataSight can and should be used to help streamline data migration. The importation process allows for a pre-load 'data validation' step, where you interrogate the data to be transferred to ensure that it fully complies with your database structure. Any issues that occur at the point of loading are automatically reported in the import log.
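If you also want to check a file yourself before it reaches DataSight, a simple pre-check might look like the sketch below. It is not DataSight's own validation step. The required fields, the timestamp format and the function names are assumptions made for the example.

```python
from datetime import datetime

REQUIRED_FIELDS = ("site", "timestamp", "value")  # hypothetical record structure
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"            # assumed format of the source file


def validate_record(record):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing {field}")
    if record.get("timestamp"):
        try:
            datetime.strptime(record["timestamp"], TIMESTAMP_FORMAT)
        except ValueError:
            problems.append("timestamp not in expected format")
    if record.get("value"):
        try:
            float(record["value"])
        except ValueError:
            problems.append("value is not numeric")
    return problems


def validate_file(records):
    """Map row number (starting at 1) to its problems, for failing rows only."""
    report = {}
    for number, record in enumerate(records, start=1):
        problems = validate_record(record)
        if problems:
            report[number] = problems
    return report
```

Running validate_file over the rows of a source file gives a short report of which rows need attention before loading.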

 

  3. Verify your data:

After loading into DataSight, your results can be subjected to data verification procedures, such as flagging, to determine whether the data was accurately translated and is complete.

 

For time series data, DataSight offers a time resolution as fine as one second. Learn more about Time Limitation, Raw Data File Formats, Depth in Water Bodies and Duplicates.

 

See also:

  • Input Your Data
  • Import Routine
  • Manual Data Entry
  • Automated Import

 

Time Limitation

In DataSight Version 3, the minimum time increment has been set to 1 second.

While Microsoft SQL Server can currently store data captured less than 3 milliseconds apart, DataSight is not currently designed to accept this data. If you need sub-second data to be stored in DataSight, please contact us.
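To find out whether a data set is affected by the one-second limit, you could scan its timestamps before import. The sketch below uses made-up readings and hypothetical helper names. It also shows why simply truncating to whole seconds is risky: distinct readings can end up with the same timestamp.

```python
from datetime import datetime


def has_subsecond(ts):
    """True if the timestamp carries a fractional-second component."""
    return ts.microsecond != 0


def truncate_to_second(ts):
    """Drop the fractional part, keeping the whole second."""
    return ts.replace(microsecond=0)


# Made-up logger readings, two of them within the same second.
readings = [
    datetime(2021, 3, 4, 10, 0, 0, 250000),
    datetime(2021, 3, 4, 10, 0, 0, 750000),
    datetime(2021, 3, 4, 10, 0, 1),
]

flagged = [ts for ts in readings if has_subsecond(ts)]
truncated = [truncate_to_second(ts) for ts in readings]

# Number of readings that no longer have a unique timestamp after truncation.
collisions = len(truncated) - len(set(truncated))
```

A non-zero collision count means the data cannot be loaded at one-second resolution without either aggregating readings or losing some of them.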

Raw Data File Formats

At present DataSight cannot import multiple data records which are stored as one continuous row of data. You may need to preprocess the data using Excel to break the data into rows.
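As an alternative to Excel, the same preprocessing can be scripted. The sketch below assumes the continuous row is comma-separated and that every record has the same number of fields. Both are assumptions about your file, and the sample line is invented.

```python
import csv
import io


def split_continuous_row(values, fields_per_record):
    """Break one long list of values into equal-sized records."""
    if fields_per_record <= 0:
        raise ValueError("fields_per_record must be positive")
    if len(values) % fields_per_record != 0:
        raise ValueError("row length is not a multiple of the record size")
    return [
        values[start:start + fields_per_record]
        for start in range(0, len(values), fields_per_record)
    ]


# Made-up example: three (timestamp, value) records stored on a single line.
raw_line = "2020-01-01 09:00:00,18.2,2020-01-01 10:00:00,18.6,2020-01-01 11:00:00,19.1"
values = next(csv.reader(io.StringIO(raw_line)))
records = split_continuous_row(values, fields_per_record=2)
```

The resulting records can then be written out one per row, for example with csv.writer, and imported as a normal file.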

Depth in Water Bodies

Certain types of environmental data may have records for seemingly the same date and time at one specific location. For example, if you are monitoring changes in water temperature at one site, but at various depths, the location and date will remain constant. This is fine, but each record must have a unique timestamp, so it is imperative that the time changes for each of the depth entries. This can be as simple as offsetting each entry (depth) by one minute, such as 19:50:00, 19:51:00, 19:52:00, and so on.

There are two options when you are faced with data such as this:

Option 1. Ensure time values are assigned correctly before importing. This may involve adjusting equipment settings to record separate timestamps for each sample, or editing the data manually before importing it into DataSight.

Option 2. DataSight can assign time values during import (see Map Levels).
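If you choose Option 1 and edit the data before import, the one-minute offset can be applied with a short script instead of by hand. This is a sketch only: the function name, the (depth, value) layout and the sample profile are assumptions for the example.

```python
from datetime import datetime, timedelta


def stagger_by_depth(base_time, depth_readings, step=timedelta(minutes=1)):
    """Give each depth reading its own timestamp, shallowest first."""
    ordered = sorted(depth_readings, key=lambda reading: reading[0])
    return [
        (base_time + index * step, depth, value)
        for index, (depth, value) in enumerate(ordered)
    ]


# Made-up profile recorded at 19:50:00: (depth in metres, temperature).
profile = [(5.0, 16.9), (0.5, 18.4), (2.0, 17.8)]
staggered = stagger_by_depth(datetime(2021, 6, 1, 19, 50, 0), profile)
```

Sorting by depth keeps the order of the offsets predictable, so the shallowest reading always keeps the original time.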

Note

DataSight is designed for each measurement to have a unique timestamp, as we believe that even replicate measurements for a sample have been conducted at different times and can be differentiated by this. Think about how you are recording your measurements with respect to time, to help resolve issues such as those described above.

Duplicates

In DataSight, data for the same variable at a given site must be stored with unique date and time stamps. However, when capturing environmental data, you may take duplicate measurements at a given locality, or analyse a sample multiple times, to obtain a statistically representative value for your measurement. At present, to save these records in DataSight for scientific interrogation, enter such duplicate data with differing time stamps. We recommend using a small time increment between each entry (e.g. one second). This can be done during Import in Step 4, Map Fields. We also recommend entering the sample name or code against the timestamps in a sample number variable, so that you can filter and identify data for a given sample number or event.
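If you prefer to prepare replicate data before import rather than during it, the one-second increment could be applied as in the sketch below. The record layout (site, variable, timestamp, sample, value) and the function name are hypothetical. The sample code is carried through unchanged so that replicates can still be grouped afterwards.

```python
from datetime import datetime, timedelta


def separate_replicates(records, step=timedelta(seconds=1)):
    """Shift records that share a site, variable and timestamp so each is unique.

    Records are processed in the order given; a record is moved forward in
    time by whole steps until its timestamp is free.
    """
    used = set()
    result = []
    for record in records:
        timestamp = record["timestamp"]
        while (record["site"], record["variable"], timestamp) in used:
            timestamp += step
        used.add((record["site"], record["variable"], timestamp))
        result.append({**record, "timestamp": timestamp})
    return result


# Made-up example: three replicate analyses of sample S-001 recorded at noon.
noon = datetime(2021, 6, 1, 12, 0, 0)
replicates = [
    {"site": "Bore 1", "variable": "pH", "timestamp": noon, "sample": "S-001", "value": 7.1},
    {"site": "Bore 1", "variable": "pH", "timestamp": noon, "sample": "S-001", "value": 7.2},
    {"site": "Bore 1", "variable": "pH", "timestamp": noon, "sample": "S-001", "value": 7.0},
]
separated = separate_replicates(replicates)
```

The input records are left unmodified, which makes it easy to compare the adjusted timestamps with the originals.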

 
