Creating an Initial Database
A database consists of fields, each holding an individual piece of data, grouped into tables. When creating a database, one defines the characteristics of each data entry. Forms are used to enter and view the data in the fields associated with them. A query creates a new table from existing tables based on the type of question asked. The resulting data is then organized in a report according to your requirements. To create a database, choose Database from the menu bar by following File > New > Database (Neuman, 2003). One can also click the arrow next to the New icon on the standard toolbar and select Database from the drop-down menu. Either route opens a new database. Before creating a database, you should have the recommended server and the recommended files in place.
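The relationship between fields, tables, queries and reports described above can be sketched in code. The example below is only an illustration: it uses Python's built-in sqlite3 module rather than the menu-driven tool the text describes, and the table name, field names and rows are all hypothetical.

```python
import sqlite3

# In-memory database; the table, fields and rows are hypothetical.
conn = sqlite3.connect(":memory:")

# A table groups fields; each field has characteristics (type, constraints).
conn.execute(
    "CREATE TABLE respondents ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " region TEXT NOT NULL,"
    " age INTEGER CHECK (age BETWEEN 0 AND 120))"
)

rows = [("Ama", "North", 34), ("Ben", "South", 41), ("Cara", "North", 29)]
conn.executemany(
    "INSERT INTO respondents (name, region, age) VALUES (?, ?, ?)", rows
)

# A query derives a new result table from the existing table.
report = conn.execute(
    "SELECT region, COUNT(*), AVG(age) FROM respondents"
    " GROUP BY region ORDER BY region"
).fetchall()

# A report presents the query result in the layout the reader needs.
for region, count, mean_age in report:
    print(f"{region}: {count} respondents, mean age {mean_age:.1f}")

conn.close()
```

The CHECK constraint plays the role of a field characteristic: a row with an age outside 0 to 120 would be rejected at entry time.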
Data editing consists of removing bad data by muting. Muting is a simple way of removing noise when high-amplitude, unpredictable events are isolated. Such events need to be edited out because they leave high-amplitude residuals that show up as prediction errors. In recent times, automatic data editing has been adopted for three-dimensional surveys because of the huge volumes of data involved. A large dynamic range in the data allows the desired signal to be captured more accurately. During data collection, however, high-amplitude noise is also recorded. The noise results from unwanted signals, e.g. ground roll, or from imperfections in the recording instruments. Data processing techniques tend to preserve amplitudes, and since the data should be free of noise, clean pre-stack data is recommended, particularly for processes such as pre-stack migration, velocity determination and AVO measurements. Data editing happens at two levels, micro and macro editing. Micro editing corrects data at the record level: individual records are checked for errors in order to determine the consistency of the data. Macro editing also detects errors, but only after the aggregate data has been analyzed (Dennard, 2000). Edits fall into different types, such as validity, range, duplication, consistency, historical, statistical and miscellaneous edits.
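The difference between micro and macro editing can be made concrete with a small sketch. It assumes hypothetical survey records, field names, allowed codes and limits, none of which come from the source, and it shows only three of the edit types listed above (validity, range and duplication) plus one aggregate check.

```python
from statistics import mean

# Hypothetical survey records; field names and limits are illustrative.
records = [
    {"id": 1, "age": 34, "sex": "F", "income": 2100},
    {"id": 2, "age": 250, "sex": "M", "income": 1800},  # age out of range
    {"id": 3, "age": 41, "sex": "X?", "income": 1950},  # invalid code
    {"id": 1, "age": 34, "sex": "F", "income": 2100},   # duplicate id
    {"id": 4, "age": 29, "sex": "M", "income": 2050},
]

def micro_edit(records):
    """Record-level checks: return (id, edit type) for each failed check."""
    flags, seen = [], set()
    for rec in records:
        if rec["sex"] not in {"F", "M"}:
            flags.append((rec["id"], "validity"))
        if not 0 <= rec["age"] <= 120:
            flags.append((rec["id"], "range"))
        if rec["id"] in seen:
            flags.append((rec["id"], "duplication"))
        seen.add(rec["id"])
    return flags

def macro_edit(records, expected_mean, tolerance):
    """Aggregate-level check: compare mean income with a prior figure."""
    observed = mean(rec["income"] for rec in records)
    return abs(observed - expected_mean) <= tolerance, observed

flags = micro_edit(records)
plausible, observed_mean = macro_edit(records, expected_mean=2000, tolerance=100)

print(flags)
print(plausible, observed_mean)
```

Note that the aggregate check passes even though three records are flagged at the record level, which is why the two levels complement rather than replace each other.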
Describing data involves data analysis and service delivery in a world overloaded with information. It entails knowing what a statistic is (a measure obtained from a sample) and the different terms used in describing data. One such term is the parameter, which is a characteristic obtained from a population. The mean is the sum of the values divided by the number of values and is denoted by x̄ (Ercan et al, 2007). The median is the midpoint of the data after it has been sorted in ascending or descending order. The mode is the most frequent value. A skewed distribution is one in which the high concentration of data values lies toward the head or toward the tail rather than in the middle. The weighted mean is found by multiplying each value by its weight, totaling the products and dividing by the sum of the weights. A symmetric distribution has data values evenly distributed on either side of the mean. The midrange is the mean of the lowest and highest values. The range is the difference between the highest value and the lowest value. Other terms used in describing data include population variance, sample variance, standard deviation, coefficient of variation, the empirical or normal rule, standard score or z-score, percentile, quartile, decile, lower hinge, upper hinge, box plot, five-number summary, interquartile range, outlier, mild outlier and extreme outlier (Eriksson et al, 2006).
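Several of the measures defined above can be computed with Python's standard statistics module. The sketch below uses a small hypothetical sample and hypothetical weights. The quartiles use the module's "inclusive" method, which is one of several conventions and may differ slightly from hinges computed by hand. The outlier fences follow the usual 1.5 and 3 times the interquartile range rule, which is assumed here rather than taken from the source.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11, 30]  # hypothetical sample

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
midrange = (min(data) + max(data)) / 2
data_range = max(data) - min(data)

# Weighted mean: sum of value * weight, divided by the sum of the weights.
values, weights = [80, 90, 70], [2, 3, 5]
weighted_mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Spread: population variance, sample variance and sample standard deviation.
pop_variance = statistics.pvariance(data)
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

# Standard score (z-score) of the largest value.
z_max = (max(data) - mean) / sample_sd

# Five-number summary and interquartile range.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_number = (min(data), q1, q2, q3, max(data))
iqr = q3 - q1

# Mild outliers lie beyond 1.5 * IQR from the quartiles, extreme beyond 3 * IQR.
extreme = [x for x in data if x < q1 - 3 * iqr or x > q3 + 3 * iqr]
mild = [
    x for x in data
    if (x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr) and x not in extreme
]

print(mean, median, mode, midrange, data_range, weighted_mean)
print(five_number, iqr, mild, extreme)
```

In this sample the mean is well above the median because the single large value pulls it upward, which is the typical signature of a distribution skewed toward high values.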
Data manipulation is also referred to as data fudging. It includes making up false data and selectively choosing data for reporting. An example of selective reporting is choosing, for the final tally, a group of results that follows a pattern consistent with the preferred hypothesis. Doing this means ignoring the other values, results and data runs that contradict the hypothesis. A positive result might be, for instance, a data run in which the researcher guesses a hidden card or variable at a frequency greater than chance (Bernard, 2000). Manipulation here therefore involves a case where the hypothesis is not confirmed by the totality of the research experiment; it is confirmed only by a selected group of successful tests. When data has been manipulated, the study results cannot be reproduced by another investigation.
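The effect of selective reporting can be shown with a small simulation of the card-guessing example. This is a sketch under stated assumptions: five card symbols, 25 guesses per run, 50 runs and a fixed random seed are all hypothetical choices, and the guesses are pure chance, so no real effect exists in the data.

```python
import random

rng = random.Random(0)   # fixed seed so the sketch is repeatable
N_SYMBOLS = 5            # assumed: five possible card symbols
GUESSES_PER_RUN = 25
N_RUNS = 50
EXPECTED_HITS = GUESSES_PER_RUN // N_SYMBOLS  # hits expected by chance

# Each run counts lucky hits from pure guessing.
runs = [
    sum(rng.randrange(N_SYMBOLS) == 0 for _ in range(GUESSES_PER_RUN))
    for _ in range(N_RUNS)
]

def hit_rate(hit_counts):
    """Fraction of correct guesses across the given runs."""
    return sum(hit_counts) / (len(hit_counts) * GUESSES_PER_RUN)

# Honest reporting: every run is counted.
overall_rate = hit_rate(runs)

# Selective reporting: only the runs that beat chance are counted.
reported = [hits for hits in runs if hits > EXPECTED_HITS]
reported_rate = hit_rate(reported) if reported else float("nan")

print(f"chance: {1 / N_SYMBOLS:.2f}")
print(f"all {len(runs)} runs: {overall_rate:.3f}")
print(f"{len(reported)} selected runs: {reported_rate:.3f}")
```

The selected runs always show a hit rate above chance, even though the full set of runs does not support the hypothesis, and a second investigator who reports every run would fail to reproduce the result.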
Matching Data Variables with Analytical Methods
A data variable is a quantity measured on a continuous and infinite scale. It is not measured in distinct units that demand yes/no answers. The variables are plotted on charts such as range charts and standard deviation charts. Analytical methods can be matched to data variables because the methods are used where the data output has literal peaks. The position and size of each peak are obtained by a mathematical transformation of the primary data, which follows a function of a different form. In this case the function may be continuously decreasing or increasing (Annor-Frempong & Duvel, 2011). To match data with analytical methods, the data for several key events are first imputed, since the data analysis would otherwise be based on dissimilar events. The events should be included in the data file to produce consistent results. Second, the variables collected in the original questionnaire should be in a form convenient for collection; the methods then allow them to be combined into a form that is easy to analyze (Griffis & Cooper, 2010). Third, the variables are summarized and included in the record file. Next, certain indices are calculated and also included in the record file. Finally, the data in the record file is put in a standardized format that allows for easy comparison and analysis.
Independent & Dependent Variables
In an experiment, the variable that the researcher varies, controls and manipulates is known as the independent variable. It is the presumed cause, the antecedent, in a piece of research. In a non-experimental procedure, where there is no manipulation, the independent variable is the one that has an effect on the dependent variable (Eriksson et al, 2006). For example, in research on cigarette smoking versus lung cancer, smoking is the independent variable because it can be manipulated. In other instances the independent variable is beyond manipulation and control; such cases include ethnicity and gender. The dependent variable, on the other hand, is the response that is measured. It is the presumed effect, the consequent, in the research carried out (Datta, 2010). The researcher cannot manipulate this variable; instead it is measured or observed for variation that is inferred to result from variation in the independent variable. The dependent variable is also the status of the outcome recorded in the data.
Validity is an important aspect of data collection because without it the research is meaningless and a great deal of money, time and energy is wasted. According to Duvel (1994), validity concerns the meaningfulness, appropriateness and usefulness of the inferences made from a test score. For findings to be appropriate, useful and meaningful, they need to be valid. To obtain valid research findings, the context and purpose of the survey items must be considered in order to determine the appropriateness of the inferences (Duvel, 1994).
Tests that lack reliability produce ambiguous scores. A test should achieve a reasonable level of reliability in order to be precise. For example, on an unreliable test a score of 80 may not differ from a score of 90 or 70 in terms of what the students know (Ercan et al, 2007). Thus, if a test is not reliable, it cannot be valid.
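One common way to quantify this point, not taken from the source, is the standard error of measurement from classical test theory: SEM = SD × sqrt(1 − reliability). The sketch below assumes a hypothetical score standard deviation of 10 and two hypothetical reliability coefficients, and builds an approximate 95% band around an observed score of 80.

```python
import math

def score_band(score, sd, reliability, z=1.96):
    """Approximate 95% band around an observed score.

    Uses the classical standard error of measurement,
    SEM = sd * sqrt(1 - reliability). All inputs are illustrative.
    """
    sem = sd * math.sqrt(1.0 - reliability)
    return score - z * sem, score + z * sem

# Hypothetical test with score SD of 10 and two reliability levels.
unreliable = score_band(80, sd=10, reliability=0.70)
reliable = score_band(80, sd=10, reliability=0.95)

print("reliability 0.70:", [round(x, 1) for x in unreliable])
print("reliability 0.95:", [round(x, 1) for x in reliable])
```

Under these assumptions the band for the less reliable test stretches past both 70 and 90, so the three scores cannot be told apart, while the band for the more reliable test excludes both.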
When the requirements of a research project are not made crystal clear, they are open to misinterpretation. Precision is therefore necessary in data collection to limit misinterpretation. Whatever the data collection method or project requirements, what matters is that they are not too fuzzy (Neuman, 2003).
Data entry is the most important aspect of data collection: when data is entered correctly, the information collected helps execute plans correctly or reach a reliable conclusion (Chan, 2004). It also helps in determining the goals of the research and in rectifying any shortcomings encountered. Information collected with mistakes is skewed and inaccurate. Information should be entered accurately and efficiently for research to be precise, reliable and valid. Incorrect data in a database often diminishes the full value of the research. For example, outdated information in an organization, e.g. old appraisals and outdated past-due rental notices, will make a company appear to be making greater losses than it actually is. This would affect the business negatively in terms of future financing.