Kezdi.Kezdi - Module

Kezdi.jl is a Julia package for data manipulation and analysis. It is inspired by Stata, but it is written in Julia, which makes it faster and more flexible. It is designed to be used in the Julia REPL, but it can also be used in Jupyter notebooks or in scripts.

Base.ismissing - Method
ismissing(args...) -> Bool

Return true if any of the arguments is missing.
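
As a quick illustration of the multi-argument method described above, the following sketch assumes Kezdi.jl is installed and loaded. The sample values are invented for the example.

```julia
using Kezdi

# With Kezdi loaded, ismissing accepts several arguments and
# reports whether at least one of them is missing.
any_missing = ismissing(1, missing, 3)
none_missing = ismissing(1, 2, 3)

println(any_missing)
println(none_missing)
```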

Kezdi.cond - Method
cond(x, y, z)

Return y if x is true, otherwise return z. If x is a vector, the operation is vectorized. This function mimics x ? y : z, which cannot be vectorized.
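
The scalar and vector cases might look like the sketch below. It assumes Kezdi.jl is installed; the function is called with its qualified name so the example does not depend on whether cond is exported, and the inputs are invented.

```julia
using Kezdi

# Scalar condition: behaves like the ternary operator x ? y : z.
scalar_result = Kezdi.cond(true, "yes", "no")

# Vector condition: the choice is made element by element.
flags = [true, false, true]
vector_result = Kezdi.cond(flags, 1, 0)

println(scalar_result)
println(vector_result)
```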

Kezdi.distinct - Method
distinct(x::AbstractVector) = unique(x)

Convenience function to get the distinct values of a vector.
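
Because distinct is defined as unique, a call on a small made-up vector should behave as sketched here (assuming Kezdi.jl is installed; the qualified name avoids relying on exports).

```julia
using Kezdi

xs = [3, 1, 3, 2, 1]

# distinct(x) is defined as unique(x), so values come back
# in order of first appearance.
ds = Kezdi.distinct(xs)

println(ds)
```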

Kezdi.getdf - Method
getdf() -> AbstractDataFrame

Return the global data frame.

Kezdi.keep_only_values - Method
keep_only_values(x::AbstractVector) -> AbstractVector

Return a vector with only the values of x, excluding any missing values, nothings, Infs, and NaNs.

Kezdi.rowcount - Method
rowcount(x::AbstractVector) = length(keep_only_values(x))

Count the number of valid values in a vector.
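
To make "valid values" concrete, here is a small sketch that runs keep_only_values and rowcount on the same made-up vector. It assumes Kezdi.jl is installed and uses qualified names so that it does not depend on what the package exports.

```julia
using Kezdi

# A vector mixing ordinary numbers with missing, NaN and Inf.
raw = [1.5, missing, NaN, Inf, 2.5]

# Only the ordinary numbers should survive the filter.
clean = Kezdi.keep_only_values(raw)

# rowcount is defined as the length of the filtered vector.
n_valid = Kezdi.rowcount(raw)

println(collect(clean))
println(n_valid)
```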

Kezdi.setdf - Method
setdf(df::Union{AbstractDataFrame, Nothing})

Set the global data frame.
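
A typical round trip with setdf and getdf could look like the following sketch. It assumes Kezdi.jl and DataFrames.jl are installed; the column names id and score are invented for the example.

```julia
using Kezdi
using DataFrames

# A small illustrative data frame (hypothetical columns).
df = DataFrame(id = 1:3, score = [10.0, 20.0, 30.0])

# Register it as the global data frame that the @-commands operate on.
setdf(df)

# Retrieve the global data frame again.
current = getdf()

println(nrow(current))
println(names(current))
```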

Kezdi.@append - Macro
@append "filename.dta"

Append the data from the file filename.dta to the global data frame. Columns that are not common to both are filled with missing values.

Kezdi.@collapse - Macro
@collapse y1 = expr1 y2 = expr2 ... [@if condition], [by(group1, group2, ...)]

Collapse df by evaluating expressions expr1, expr2, etc. If condition is provided, the operation is executed only on rows for which the condition is true. If by is provided, the operation is executed by group.
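
A grouped collapse following the syntax above might be written as in this sketch. It assumes Kezdi.jl, DataFrames.jl and Statistics are available, that the collapsed result becomes the global data frame (read back here with getdf), and it uses invented columns g and y.

```julia
using Kezdi
using DataFrames
using Statistics

# Hypothetical data: two groups with a numeric variable.
setdf(DataFrame(g = ["a", "a", "b"], y = [1.0, 3.0, 10.0]))

# One output row per group, holding the group mean of y.
@collapse mean_y = mean(y), by(g)

collapsed = getdf()
println(collapsed)
```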

Kezdi.@count - Macro
@count [@if condition]

Count the number of rows for which the condition is true. If condition is not provided, the total number of rows is counted.
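
Both forms of @count are sketched below on a made-up data frame. The example assumes Kezdi.jl and DataFrames.jl are installed and that the macro returns the count as a value, which the description implies but does not state outright.

```julia
using Kezdi
using DataFrames

# Hypothetical data with a single numeric column.
setdf(DataFrame(x = [1, 2, 3]))

# Count every row.
n_all = @count

# Count only the rows that satisfy the condition.
n_large = @count @if x > 1

println(n_all)
println(n_large)
```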

Kezdi.@describe - Macro
@describe [y1] [y2]...

Show the names and data types of the columns of the data frame. If no variable names are given, all columns are shown.

Kezdi.@drop - Macro
@drop y1 y2 ...

or @drop [@if condition]

Drop the variables y1, y2, etc. from df. If condition is provided, the rows for which the condition is true are dropped.

Kezdi.@egen - Macro
@egen y1 = expr1 y2 = expr2 ... [@if condition], [by(group1, group2, ...)]

Generate new variables in df by evaluating expressions expr1, expr2, etc. If condition is provided, the operation is executed only on rows for which the condition is true. When the condition is false, the variables will be missing. If by is provided, the operation is executed by group.
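
In contrast to a collapse, @egen keeps every row and attaches the group-level result to each of them. The sketch below assumes Kezdi.jl and DataFrames.jl are installed and that the command updates the global data frame; the columns g and x are invented.

```julia
using Kezdi
using DataFrames

# Hypothetical data: two groups with a numeric variable.
setdf(DataFrame(g = ["a", "a", "b"], x = [1, 2, 10]))

# Each row receives the sum of x within its own group.
@egen total = sum(x), by(g)

with_totals = getdf()
println(with_totals)
```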

Kezdi.@generate - Macro
@generate y = expr [@if condition]

Create a new variable y in df by evaluating expr. If condition is provided, the operation is executed only on rows for which the condition is true. When the condition is false, the variable will be missing.
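
The effect of the condition on the new variable can be seen in a small sketch like this one. It assumes Kezdi.jl and DataFrames.jl are installed and that the command updates the global data frame; the column names are invented.

```julia
using Kezdi
using DataFrames

# Hypothetical data with a single numeric column.
setdf(DataFrame(x = [1, 2, 3]))

# z is computed only where x > 1; the first row should be missing.
@generate z = x * 10 @if x > 1

generated = getdf()
println(generated)
```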

Kezdi.@head - Macro
@head [n]

Display the first n rows of the data frame. By default, n is 5.

Kezdi.@keep - Macro
@keep y1 y2 ... [@if condition]

Keep only the variables y1, y2, etc. in df. If condition is provided, only the rows for which the condition is true are kept.
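
Selecting columns and filtering rows in one step might look like the following sketch, which assumes Kezdi.jl and DataFrames.jl are installed and that the command updates the global data frame. The columns x, y and w are invented.

```julia
using Kezdi
using DataFrames

# Hypothetical data with three columns.
setdf(DataFrame(x = [1, 2, 3], y = [4, 5, 6], w = [7, 8, 9]))

# Keep columns x and y, and only the rows where x > 1.
@keep x y @if x > 1

kept = getdf()
println(kept)
```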

Kezdi.@list - Macro
@list [y1 y2...] [@if condition]

Display the entire data frame or the rows for which the condition is true. If variable names are provided, only the variables in the list are displayed.

Kezdi.@mvencode - Macro
@mvencode y1 y2 [_all] ... [@if condition], [mv(value)]

Encode missing values in the variables y1, y2, etc. in the data frame. If condition is provided, the operation is executed only on rows for which the condition is true. If mv is provided, missing values are encoded as value. By default, value is missing, which leaves the data frame unchanged. Using _all encodes all variables of the data frame.

Kezdi.@order - Macro
@order y1 y2 ... , [desc] [last] [after=var] [before=var] [alphabetical]

Reorder the variables y1, y2, etc. in the data frame. By default, the variables are ordered in the order they are listed. If desc is provided, the variables are ordered in descending order. If last is provided, the variables are moved to the end of the data frame. If after is provided, the variables are moved after the variable var. If before is provided, the variables are moved before the variable var. If alphabetical is provided, the variables are ordered alphabetically.

Kezdi.@regress - Macro
@regress y x1 x2 ... [@if condition], [robust] [cluster(var1, var2, ...)]

Estimate a regression model in df with dependent variable y and independent variables x1, x2, etc. If condition is provided, the operation is executed only on rows for which the condition is true. If robust is provided, robust standard errors are calculated. If cluster is provided, clustered standard errors are calculated.

The regression is limited to rows in which every variable has a valid value. Missing values, infinity, and NaN are automatically excluded.

Kezdi.@rename - Macro
@rename oldname newname

Rename the variable oldname to newname in the data frame.

Kezdi.@replace - Macro
@replace y = expr [@if condition]

Replace the values of y in df with the result of evaluating expr. If condition is provided, the operation is executed only on rows for which the condition is true. When the condition is false, the variable will be left unchanged.
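
A conditional replacement that leaves the other rows untouched could be sketched as follows. The example assumes Kezdi.jl and DataFrames.jl are installed and that the command updates the global data frame; the column name balance is invented.

```julia
using Kezdi
using DataFrames

# Hypothetical data containing some negative values.
setdf(DataFrame(balance = [-5, 20, -1]))

# Set negative balances to zero; other rows stay as they are.
@replace balance = 0 @if balance < 0

replaced = getdf()
println(replaced)
```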

Kezdi.@save - Macro
@save "filename.dta", [replace]

Save the global data frame to the file filename.dta. If the file already exists, the replace option must be provided.

Kezdi.@sort - Macro
@sort y1 y2 ... , [desc]

Sort the data frame by the variables y1, y2, etc. By default, the variables are sorted in ascending order. If desc is provided, the variables are sorted in descending order.
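
A descending sort following the syntax above might look like this sketch. It assumes Kezdi.jl and DataFrames.jl are installed and that the command updates the global data frame; the column x is invented.

```julia
using Kezdi
using DataFrames

# Hypothetical unsorted data.
setdf(DataFrame(x = [2, 3, 1]))

# Sort by x, largest value first.
@sort x, desc

sorted = getdf()
println(sorted)
```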

Kezdi.@summarize - Macro
@summarize y [@if condition]

Summarize the variable y in df. If condition is provided, the operation is executed only on rows for which the condition is true.

Kezdi.@tabulate - Macro
@tabulate y1 y2 ... [@if condition]

Create a frequency table for the variables y1, y2, etc. in df. If condition is provided, the operation is executed only on rows for which the condition is true.

Kezdi.@tail - Macro
@tail [n]

Display the last n rows of the data frame. By default, n is 5.

Kezdi.@use - Macro
@use "filename.dta", [clear]

Read the data from the file filename.dta and set it as the global data frame. If there is already a global data frame, @use throws an error unless the clear option is provided.
