Home Page - Alea
          
Glossary
         
 

A  B  C  D  E  F  G  H  I  J  L  M  N  O  P  Q  R  S  T  U  V  X  Z

 

 

Definition

 
A

 
Aberrant value A value that stands apart from all others, giving the impression that it does not belong to the same dataset.  
Absolute frequency Number of elements belonging to a specified class.  
Accumulated absolute frequency The accumulated absolute frequency of index number i is the sum of the absolute frequencies of the variable values from the first such value to the i-th one.  
Accumulated relative frequency The accumulated relative frequency of index number i is the sum of the relative frequencies of the variable values from the first such value to the i-th one.  
Arithmetic mean The mean is the measure most used to locate the sample centre. It is obtained by adding together all of the sample elements and dividing the total by the sample size.  
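
The frequency and mean definitions above can be illustrated with a minimal Python sketch. The function names and the sample data are hypothetical and chosen only for illustration; they are not part of the glossary.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(data):
    """Return one row per distinct value:
    (value, absolute, relative, accumulated absolute, accumulated relative)."""
    counts = Counter(data)
    n = len(data)
    values = sorted(counts)                      # distinct values in increasing order
    absolute = [counts[v] for v in values]       # absolute frequencies
    cum_abs = list(accumulate(absolute))         # running sum up to the i-th value
    return [(v, a, a / n, c, c / n) for v, a, c in zip(values, absolute, cum_abs)]


def arithmetic_mean(data):
    """Sum of all sample elements divided by the sample size."""
    return sum(data) / len(data)


sample = [2, 3, 3, 5, 5, 5, 7, 10]  # illustrative data only
for row in frequency_table(sample):
    print(row)
print(arithmetic_mean(sample))
```

The last row of the table always has accumulated absolute frequency n and accumulated relative frequency 1.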


 
   
B
 
   
Bar chart A graphical method in which points representing the classes are marked along the x-axis of a system of coordinate axes, and a vertical bar is drawn at each point with a height equal or proportional to the absolute or relative frequency of the class.  
Bivariate data A pair of values corresponding to a specific individual or experimental outcome.  
Bivariate distributions Statistical distributions in which two variables are analysed.  
Box-and-whisker plot It is a graphical method highlighting certain characteristics of the sample. The set of sample values located between the 1st and the 3rd quartiles, Q.25 and Q.75, is represented by a rectangle (box) with the median indicated by a bar. Two lines then join the rectangle's sides to the so-called adjacent values.  


 
C
 
Census The scientific study of a universe of people, institutions or physical objects that analyses all of its elements in order to obtain knowledge and make quantitative inferences regarding important characteristics of that population.  
Certain event An event that occurs with probability 1. The sample space is itself an event, and it is a certain event.  
Class range The difference between the maximum and minimum value of the class.  
Chance Phenomena whose individual results are uncertain but which show long-term regularity, making it possible to obtain a general behaviour pattern.  
Complementary event The complementary event of event A is the event corresponding to all the results of the sample space S that are not in A.  
Contingency table A representation of qualitative or quantitative data, used especially for bivariate data, that is, data that can be classified according to two criteria. In a contingency table the rows correspond to one of the criteria and the columns to the other.  
Continuous data Quantitative data that can take any numerical value within the variation interval.  
Correlation coefficient A measure of the level of linear association between two variables.  
Cumulative function Function of the accumulated frequencies of a dataset.  
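
As an illustration of the correlation coefficient entry, here is a minimal sketch. It assumes the Pearson product-moment coefficient, which the glossary does not name explicitly; the function name and data are hypothetical.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient of paired data (assumed convention)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4]   # illustrative data only
ys = [2, 4, 6, 8]   # exactly linear in xs, so the coefficient should be 1
print(correlation_coefficient(xs, ys))
```

The result lies between -1 and 1; values near either end indicate a strong linear association, and the function is undefined when one of the variables is constant.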


 
D
 
Descriptive statistics The descriptive study of the data of a sample (or a population), in which the information contained in the dataset is summarised by means of tables, plots and the calculation of some characteristics of the dataset: statistics, in the case of a sample, or parameters, in the case of a population.  
Deterministic experiment A deterministic experiment is characterised by producing the same result, as long as it is repeated under the same conditions.  
Discrete data Quantitative data that can only take a finite, or countably infinite, number of different values.  
Disjoint events Disjoint events or mutually exclusive events are events in which the occurrence of one of them implies the non-occurrence of the other one.  
Distribution with long tails The frequencies are distributed so that there are a large number of classes at the ends, with small frequencies compared to the central classes.  


 
E


 
Elementary event An event that corresponds to a single possible result of a random experiment.  
Empirical distribution function A function F(x), defined for every real number x, that gives the proportion of elements of the sample that are less than or equal to x.  
Estimate The outcome of the estimator using a specific sample as the basis.  
Estimator This is a sample statistic (random variable), the specific values of which constitute estimates of the parameters in question. See Statistic (2).  
Event A set of possible results of a random experiment or, in other words, a subset of the sample space S.  
Extreme and quartile diagram It is a graphical representation highlighting certain characteristics of the sample. The set of sample values located between the 1st and the 3rd quartiles, Q.25 and Q.75 is represented by a rectangle (box) with the median indicated by a sign. Two lines then join the rectangle's sides to the maximum and minimum values, respectively.  
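
The empirical distribution function defined above can be sketched as follows. The function name and the sample are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right


def empirical_distribution_function(sample):
    """Return F, where F(x) is the proportion of sample elements <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many ordered elements are <= x
        return bisect_right(ordered, x) / n

    return F


F = empirical_distribution_function([1, 2, 2, 4])  # illustrative data only
for x in (0, 1, 2, 3, 4):
    print(x, F(x))
```

F is a step function: it is 0 below the sample minimum, jumps at each observed value, and equals 1 from the sample maximum onwards.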


 
F
 
Frequency distribution See bar chart.  
Frequency polygon A line that links the tops of the bars of a bar chart.  
Frequency table A table showing the distribution of the variable, in other words, the values or forms that the variable can take on, as well as the frequency with which those values occur.  


 
H
 
Histogram A histogram is a graphical plot of continuous data, formed of a succession of adjacent rectangles. Each rectangle refers to a class interval and the area of each one corresponds to the relative frequency (or absolute frequency). The total area of the histogram is, therefore, equal to 1 (respectively equal to n, the sample size).  
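
Because the area of each rectangle equals the relative frequency of its class, the height of a rectangle is the relative frequency divided by the class width. The sketch below illustrates this. The function name, the data and the class limits are hypothetical, and the convention that classes are closed on the left (with the last class also closed on the right) is an assumption.

```python
def histogram_heights(data, edges):
    """Heights of histogram rectangles so that area = relative frequency.

    Assumed convention: classes are [edges[i], edges[i+1]), except the
    last class, which also includes its upper limit.
    """
    n = len(data)
    k = len(edges) - 1
    heights = []
    for i in range(k):
        lo, hi = edges[i], edges[i + 1]
        is_last = i == k - 1
        count = sum(1 for x in data if lo <= x < hi or (is_last and x == hi))
        heights.append(count / n / (hi - lo))  # relative frequency / class width
    return heights


data = [1, 2, 2, 3, 5, 6, 9, 10]   # illustrative data only
edges = [0, 2, 4, 10]              # classes of unequal width
heights = histogram_heights(data, edges)
widths = [b - a for a, b in zip(edges, edges[1:])]
print(heights)
print(sum(h * w for h, w in zip(heights, widths)))  # total area
```

With classes of unequal width, plotting the frequencies themselves as heights would distort the picture, which is why the area convention matters.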


 
I
 
Impossible event The event that results from the intersection of disjoint (mutually exclusive) events.  
Interquartile range A measure of the variability of a sample, corresponding to the difference between the values of the third and first quartiles. This provides information on the width of the interval containing the middle 50% of the observations.  
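
A minimal sketch of the interquartile range, using Python's standard `statistics.quantiles`. The glossary does not fix a rule for locating quartiles between observations, so the "inclusive" method used here is an assumption; other conventions can give slightly different values. The function names and data are hypothetical.

```python
import statistics


def quartiles(data):
    """First, second and third quartiles (assumed 'inclusive' convention)."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q1, q2, q3


def interquartile_range(data):
    """Difference between the third and first quartiles."""
    q1, _, q3 = quartiles(data)
    return q3 - q1


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative data only
print(quartiles(sample))
print(interquartile_range(sample))
```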


 
L
 
Location measures Measures that locate and characterize the centre of a sample.  
M
 
Mean deviation The arithmetic mean of the absolute values of the deviations of each data value xi from the mean.  
Median A measure used to locate the centre of the data distribution. It is the value that divides the sample in half: half of the elements of the dataset are less than or equal to the median and the other half are greater than or equal to it.  
Mill rate Proportion relative to one thousand.  
Modal class The value that occurs most frequently if the data are discrete, or the class interval with the greatest frequency if the data are continuous.  
Mode The value that occurs most frequently in a dataset, if the data are discrete, or the class interval with the greatest frequency if the data are continuous or grouped.  
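
The mean deviation, median and mode of a small discrete dataset can be computed as in the sketch below. `mean_deviation` is a hypothetical helper written for illustration; `statistics.median` and `statistics.multimode` come from the Python standard library.

```python
import statistics


def mean_deviation(data):
    """Arithmetic mean of the absolute deviations from the mean."""
    m = statistics.fmean(data)
    return sum(abs(x - m) for x in data) / len(data)


sample = [2, 3, 3, 5, 5, 5, 7, 10]    # illustrative data only
print(mean_deviation(sample))         # mean of |xi - mean|
print(statistics.median(sample))      # middle of the ordered sample
print(statistics.multimode(sample))   # list of the most frequent value(s)
```

`multimode` returns a list because a dataset can have more than one value that shares the highest frequency.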
P
 
Parameter It is a number that describes a characteristic of the population. Even though it is a fixed number, it is usually unknown. An unknown parameter can be estimated from a statistic (or estimator).  
Percentile See p-Quantile.  
Pictogram A graphical representation in which the data are represented by an image (or by a symbol) that is proportional to the frequency.  
Pie chart A graphic consisting of a circle divided into a number of sectors. The number of sectors is equal to the number of classes in the frequency table of the sample under analysis. The sector angles are proportional to the class frequency.  
Population A collection of individual units, which can be people or experiment results, that are to be the focus of study and have one or more common characteristics.  
Population increase The difference between population numbers at two different points in time. The population increase is calculated by the addition of the natural balance and migration balance.  
p-Quantile (Percentile and Quartile) The value Qp is known as the p-quantile, 0<p<1, or 100p% percentile, when 100p% of the sample elements are less than or equal to Qp and the remaining elements are greater than or equal to Qp. The 0.25 and 0.75 quantiles are respectively called the 1st and 3rd quartiles.  
Probability models Mathematical models used to describe random phenomena.  
Probability (frequency definition) The probability of an event A, represented by P(A), is defined as the value obtained for the relative frequency observed for A over a large number of repetitions of the random experiment.  
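
The frequency definition of probability can be illustrated by simulation. The sketch below assumes a fair six-sided die and the event "a six is rolled"; the function name, the seed and the numbers of trials are hypothetical choices made for illustration.

```python
import random


def relative_frequency_of_six(trials, seed=0):
    """Relative frequency of rolling a six in `trials` simulated die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
    return hits / trials


# The relative frequency should settle near 1/6 as the number of rolls grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency_of_six(n))
```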


 
Q
 
Qualitative data Data that represents the information concerning a quality, category or characteristic that cannot be measured but can be classified in various forms.  
Quantitative data Data that represents the information resulting from measurable characteristics, exhibited to different extents, that can be discrete - discrete data, or continuous - continuous data.  


 
R
 
Random experiment An experiment with the following characteristics: it can be performed repeatedly, under the same conditions and independently each time; the possible results are known; there is insufficient knowledge to predict which of the possible results will be obtained when the experiment is performed or the phenomenon observed.  
Random variable A random variable X is a function that associates a number to each point of the sample space S.  
Regression line It is the line that best fits the points of a scatter plot.  
Relative frequency The ratio between the number of elements belonging to a specified class and the total number of elements of the dataset under analysis.  
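
The regression line entry does not say what "best fits" means. A common choice, assumed here, is the least-squares line, which minimises the sum of squared vertical distances between the points and the line. The function name and data below are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept


xs = [1, 2, 3, 4]   # illustrative data only
ys = [3, 5, 7, 9]   # lies exactly on y = 2x + 1
print(least_squares_line(xs, ys))
```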


 
S
 
Sample A set of data or observations, collected from a subset of the population. Sometimes a sample is studied with the objective of drawing conclusions on the population from which it was collected.  
Sample range A measure of the variability of a sample, corresponding to the difference between the maximum and minimum value of the dataset.  
Sample size Number of elements of the sample.  
Sample space The set of all possible individual results of a random experiment (or of a random phenomenon under analysis).  
Scatter plot The scatter plot is a graphical representation of bivariate data, in which each data pair (xi, yi) is represented by a coordinate point (xi, yi) in a system of coordinate axes.  
SP Abbreviation of scatter plot. See Scatter plot.  
Skewed distribution A skewed distribution may be represented by a histogram with a frequency distribution that is markedly asymmetrical, containing values on one side that are substantially smaller than those on the other side of the distribution.  
Skewed (biased) sample A sample that does not correctly represent the whole population.  
Standard deviation A measure of the variability of a sample in relation to its mean, corresponding to the square root of the variance. It is expressed in the same units as the original data.  
Statistics (1) A discipline with the basic objective of collecting, compiling, analysing and interpreting data.  
Statistic (2) It is a number that describes the sample. The value of a statistic is calculated from the sample’s observed values. The statistic is used to estimate an unknown parameter.  
Statistical inference This is a fundamental phase of statistical analysis, during which, once certain properties are known (obtained via a descriptive analysis of the sample), expressed by means of propositions, more general statements are formulated, which express the existence of laws (relative to the population).  
Stem-and-leaf diagram Also known as the stemplot, it is a type of data representation that can be deemed to be halfway between a table and a graph, given that the true sample values are displayed, but in a presentation that brings to mind a histogram. It consists of writing the leading digit (or digits) of each value on the left-hand side of a vertical line, followed by the remaining digits on the right-hand side.  
Survey The scientific study of one part of a population for the purpose of studying attitudes, habits and preferences of the population with regard to events, circumstances and subjects of general interest.  
Symmetrical distribution A symmetrical distribution may be represented by a histogram with a frequency distribution that is more or less centred around a central class.  
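
A stem-and-leaf diagram can be produced with a few lines of code. The sketch below assumes non-negative integers, with the tens as the stem and the units as the leaf; the function name and data are hypothetical. For brevity, stems with no observations are omitted, whereas a hand-drawn diagram would normally show them as empty rows.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf diagram as strings.

    Assumed convention: stem = tens digit(s), leaf = units digit.
    """
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]


sample = [27, 12, 34, 21, 15, 27]  # illustrative data only
for row in stem_and_leaf(sample):
    print(row)
```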


 
V
 
Variability measures Measures that indicate and describe the variability of a dataset.  
Variable A common characteristic of a population, possessing different values from one individual to the next.  
Variance A measure obtained by adding together all the squares of the deviations of data from the mean and dividing the total by the number of observations less one.  
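
The variance as defined above (dividing by the number of observations less one) and the standard deviation derived from it can be sketched as follows. The function names and data are hypothetical; `statistics.variance` from the Python standard library uses the same divisor and is printed for comparison.

```python
import math
import statistics


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_standard_deviation(data):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(sample_variance(data))


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data only
print(sample_variance(sample))
print(sample_standard_deviation(sample))
print(statistics.variance(sample))  # library value for comparison
```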
