Kezdi.Kezdi
Base.ismissing
Kezdi.cond
Kezdi.distinct
Kezdi.getdf
Kezdi.keep_only_values
Kezdi.mvreplace
Kezdi.rowcount
Kezdi.setdf
Kezdi.@append
Kezdi.@clear
Kezdi.@collapse
Kezdi.@count
Kezdi.@describe
Kezdi.@drop
Kezdi.@egen
Kezdi.@generate
Kezdi.@head
Kezdi.@keep
Kezdi.@list
Kezdi.@mvencode
Kezdi.@names
Kezdi.@order
Kezdi.@regress
Kezdi.@rename
Kezdi.@replace
Kezdi.@reshape
Kezdi.@save
Kezdi.@sort
Kezdi.@summarize
Kezdi.@tabulate
Kezdi.@tail
Kezdi.@use
Kezdi.With.@with
Kezdi.With.@with!
Kezdi.Kezdi - Module
Kezdi.jl is a Julia package for data manipulation and analysis. It is inspired by Stata, but it is written in Julia, which makes it faster and more flexible. It is designed to be used in the Julia REPL, but it can also be used in Jupyter notebooks or in scripts.
Base.ismissing - Method
ismissing(args...) -> Bool
Return true if any of the arguments is missing.
Kezdi.cond - Method
cond(x, y, z)
Return y if x is true, otherwise return z. If x is a vector, the operation is vectorized. This function mimics x ? y : z, which cannot be vectorized.
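The element-wise behaviour described for cond can be illustrated in plain Julia. The sketch below is only an approximation built on Base's ifelse and broadcasting; my_cond is a hypothetical name, and this is not Kezdi's implementation.

```julia
# Hypothetical stand-in for cond, written with Base Julia only.
# Scalar condition: the ordinary ternary operator is enough.
my_cond(x::Bool, y, z) = x ? y : z
# Vector condition: broadcast ifelse over the elements.
my_cond(x::AbstractVector{Bool}, y, z) = ifelse.(x, y, z)

wage = [10, 25, 40]
label = my_cond(wage .> 20, "high", "low")   # one label per element of wage
```

Unlike the ternary operator, ifelse is an ordinary function, so both y and z are evaluated before one of them is chosen.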
Kezdi.distinct - Method
distinct(x::AbstractVector) = unique(x)
Convenience function to get the distinct values of a vector.
Kezdi.getdf - Method
getdf() -> AbstractDataFrame
Return the global data frame.
Kezdi.keep_only_values - Method
keep_only_values(x::AbstractVector) -> AbstractVector
Return a vector with only the values of x, excluding any missing values, nothings, Infs and NaNs.
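The filtering rule above can be sketched in plain Julia. The helper is_value is a hypothetical name introduced for illustration, and the code assumes that "value" means anything that is neither missing nor nothing and, for floating-point numbers, is finite. It is not taken from Kezdi's source.

```julia
# Hypothetical helper: true for anything that counts as a value.
is_value(v) = !(ismissing(v) || isnothing(v))
# Floating-point numbers must also be finite (excludes Inf, -Inf and NaN).
is_value(v::AbstractFloat) = isfinite(v)

x = [1.0, missing, Inf, NaN, 2.5]
values_only = filter(is_value, x)   # keeps only 1.0 and 2.5
n_valid = length(values_only)       # number of valid entries
```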
Kezdi.mvreplace - Method
mvreplace(x, y)
Return y if x is missing, otherwise return x. If x is a vector, the operation is vectorized. This function mimics ismissing(x) ? y : x, which cannot be vectorized.
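For comparison, Base Julia offers coalesce, which returns its first non-missing argument. Broadcasting it gives a similar element-wise replacement. This is shown only to illustrate the semantics; whether mvreplace is implemented this way is not stated here.

```julia
x = [1, missing, 3]
filled = coalesce.(x, 0)        # element-wise: missing entries become 0
scalar = coalesce(missing, 0)   # scalar case: missing is replaced by 0
```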
Kezdi.rowcount - Method
rowcount(x::AbstractVector) = length(keep_only_values(x))
Count the number of valid values in a vector.
Kezdi.setdf - Method
setdf(df::Union{AbstractDataFrame, Nothing})
Set the global data frame.
Kezdi.@append - Macro
@append "filename.dta" / @append df
Append the data from the file filename.dta or the DataFrame df to the global data frame. Columns that are not common to both are filled with missing values.
Kezdi.@clear - Macro
@clear
Clear the global data frame.
Kezdi.@collapse - Macro
@collapse y1 = expr1 y2 = expr2 ... [@if condition], [by(group1, group2, ...)]
Collapse df by evaluating expressions expr1, expr2, etc. If condition is provided, the operation is executed only on rows for which the condition is true. If by is provided, the operation is executed by group.
Kezdi.@count - Macro
@count [@if condition]
Count the number of rows for which the condition is true. If condition is not provided, the total number of rows is counted.
Kezdi.@describe - Macro
@describe [y1] [y2]...
Show the names and data types of the columns of the data frame. If no variable names are given, all columns are shown.
Kezdi.@drop - Macro
@drop y1 y2 ... or @drop [@if condition]
Drop the variables y1, y2, etc. from df. If condition is provided, the rows for which the condition is true are dropped.
Kezdi.@egen - Macro
@egen y1 = expr1 y2 = expr2 ... [@if condition], [by(group1, group2, ...)]
Generate new variables in df by evaluating expressions expr1, expr2, etc. If condition is provided, the operation is executed only on rows for which the condition is true. When the condition is false, the variables will be missing. If by is provided, the operation is executed by group.
Kezdi.@generate - Macro
@generate y = expr [@if condition]
Create a new variable y in df by evaluating expr. If condition is provided, the operation is executed only on rows for which the condition is true. When the condition is false, the variable will be missing.
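The "missing when the condition is false" rule can be pictured on plain vectors. The snippet below uses only Base Julia and hypothetical variable names. It illustrates the described outcome and does not show how the macro works internally.

```julia
x = [1, 2, 3]
keep = x .> 1                        # plays the role of the @if condition
y = ifelse.(keep, 2 .* x, missing)   # missing where the condition is false
```

Here y holds a value only in the positions where keep is true.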
Kezdi.@head - Macro
@head [n]
Display the first n rows of the data frame. By default, n is 5.
Kezdi.@keep - Macro
@keep y1 y2 ... [@if condition]
Keep only the variables y1, y2, etc. in df. If condition is provided, only the rows for which the condition is true are kept.
Kezdi.@list - Macro
@list [y1 y2...] [@if condition]
Display the entire data frame or the rows for which the condition is true. If variable names are provided, only the variables in the list are displayed.
Kezdi.@mvencode - Macro
@mvencode y1 y2 [_all] ... [if condition], [mv(value)]
Encode missing values in the variables y1, y2, etc. in the data frame. If condition is provided, the operation is executed only on rows for which the condition is true. If mv is provided, the missing values are encoded with the value value. By default, value is missing, which leaves the data frame unchanged. Using _all encodes all variables of the data frame.
Kezdi.@names - Macro
@names
Display the names of the variables in the data frame.
Kezdi.@order - Macro
@order y1 y2 ... , [desc] [last] [after=var] [before=var] [alphabetical]
Reorder the variables y1, y2, etc. in the data frame. By default, the variables are ordered in the order they are listed. If desc is provided, the variables are ordered in descending order. If last is provided, the variables are moved to the end of the data frame. If after is provided, the variables are moved after the variable var. If before is provided, the variables are moved before the variable var. If alphabetical is provided, the variables are ordered alphabetically.
Kezdi.@regress - Macro
@regress y x1 x2 ... [@if condition], [robust] [cluster(var1, var2, ...)]
Estimate a regression model in df with dependent variable y and independent variables x1, x2, etc. If condition is provided, the operation is executed only on rows for which the condition is true. If robust is provided, robust standard errors are calculated. If cluster is provided, clustered standard errors are calculated.
The regression is limited to rows in which every variable holds a valid value. Missing values, infinity, and NaN are automatically excluded.
Kezdi.@rename - Macro
@rename oldname newname
Rename the variable oldname to newname in the data frame.
Kezdi.@replace - Macro
@replace y = expr [@if condition]
Replace the values of y in df with the result of evaluating expr. If condition is provided, the operation is executed only on rows for which the condition is true. When the condition is false, the variable will be left unchanged.
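The "left unchanged when the condition is false" rule resembles masked assignment on a plain vector. The sketch below uses Base Julia only, with hypothetical variable names. It illustrates the described result, not the macro's internals.

```julia
y = [1, 2, 3]
x = [5, 0, 7]
mask = x .> 4     # plays the role of the @if condition
y[mask] .= 0      # only positions where mask is true are overwritten
```

Positions where mask is false keep their previous value.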
Kezdi.@reshape - Macro
@reshape long y1 y2 ... i(varlist) j(var)
@reshape wide y1 y2 ... i(varlist) j(var)
Reshape the data frame from wide to long or from long to wide format. The variables y1, y2, etc. are the variables to be reshaped. The options i(varlist) and j(var) name the variables that define the row and column indices in the reshaped data frame.
The option i() may include multiple variables, like i(var1, var2, var3). The option j() must include only one variable.
Kezdi.@save - Macro
@save "filename.dta", [replace]
Save the global data frame to the file filename.dta. If the file already exists, the replace option must be provided.
Kezdi.@sort - Macro
@sort y1 y2 ... , [desc]
Sort the data frame by the variables y1, y2, etc. By default, the variables are sorted in ascending order. If desc is provided, the variables are sorted in descending order.
Kezdi.@summarize - Macro
@summarize y [@if condition]
Summarize the variable y in df. If condition is provided, the operation is executed only on rows for which the condition is true.
Kezdi.@tabulate - Macro
@tabulate y1 y2 ... [@if condition]
Create a frequency table for the variables y1, y2, etc. in df. If condition is provided, the operation is executed only on rows for which the condition is true.
Kezdi.@tail - Macro
@tail [n]
Display the last n rows of the data frame. By default, n is 5.
Kezdi.@use - Macro
@use "filename.dta", [clear]
Read the data from the file filename.dta and set it as the global data frame. If there is already a global data frame, @use will throw an error unless the clear option is provided.