Developer guide: parameters from core library
The XGBoost core library accepts a long list of input parameters (e.g. `max_depth` for decision trees, regularization strength, the `device` where compute happens, etc.). New parameters are constantly being added as XGBoost is developed further, and the language bindings should allow passing to the core library everything that it accepts.
In the case of R, these parameters are passed as an R `list` object to the function `xgb.train`, but the R interface aims to provide a better, more idiomatic user experience by offering a parameters constructor with full in-package documentation. This requires keeping the list of parameters and their documentation up to date in the R package too, in addition to the general online documentation for XGBoost.
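For context, a minimal sketch of how this looks from the user's side (the toy data is purely for illustration; `device` is only accepted by reasonably recent XGBoost versions):

```r
library(xgboost)

# Toy data, purely for illustration
x <- as.matrix(mtcars[, -1])
y <- mtcars$mpg
dtrain <- xgb.DMatrix(data = x, label = y)

# Core-library parameters passed to xgb.train() as a plain R list;
# 'learning_rate' is a core-library alias of 'eta'
params <- list(max_depth = 4, learning_rate = 0.1, device = "cpu")
model <- xgb.train(params = params, data = dtrain, nrounds = 10)
```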
In more detail, there is a function `xgb.params` which allows the user to construct such a `list` object to pass to `xgb.train` while getting full IDE autocompletion on it. This function should accept all possible XGBoost parameters as arguments, listing them in the same order as they appear in the online documentation.
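The same training run using the constructor might look as follows (a sketch reusing `dtrain` from above; the exact argument names exposed by the constructor, e.g. `learning_rate`, are assumed to follow the core parameter names):

```r
# Same training run, but the 'list' is built by xgb.params(), so the IDE
# can autocomplete the argument names and `?xgb.params` documents them
params <- xgb.params(max_depth = 4, learning_rate = 0.1, device = "cpu")
model <- xgb.train(params = params, data = dtrain, nrounds = 10)
```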
In order to add a new parameter from the core library to `xgb.params`:
1. Add the parameter at the right location, according to the order in which it appears in the .rst file listing the parameters for the core library. If the parameter appears more than once (e.g. because it applies to more than one type of booster), then add it in a position according to its first occurrence.
2. Copy-paste the docs from the .rst file as another `@param` entry for `xgb.train`. Some easy substitutions might be needed, such as changing double-backticks to single-backticks, enquoting variables that need to be passed as strings, and replacing `:math:` calls with their roxygen equivalent `\eqn{}`, among others (see the sketch after this list).
3. If needed, make minimal modifications for the R interface. For example, since parameters are only listed once, add a note at the beginning about which type of booster the parameter applies to if it is only applicable to one type, or list default values by booster type if they differ.
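As a hypothetical illustration of these substitutions (the parameter name and its description are made up, not taken from the real docs):

```r
# Hypothetical entry as it might appear in the core library's .rst docs:
#
#   * ``new_param`` [default = auto]
#     - Scales the regularization term by :math:`\sqrt{n}` when set to auto.
#
# The same entry converted into a roxygen '@param':
#' @param new_param (default: `"auto"`)
#'   Scales the regularization term by \eqn{\sqrt{n}} when set to `"auto"`.
```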
After adding the parameter to `xgb.params`, it will also need to be added to the function `xgboost` if that function can use it. The function `xgboost` is not meant to support everything that the core library offers - currently, parameters related to learning-to-rank are not listed there, for example, as they are unusable for it (but they can be used with `xgb.train`).

In order to add the parameter to `xgboost`:
1. Add it to the function signature. The position here differs though: a few selected parameters have had their positions moved closer to the top of the signature. New parameters should not be placed within those “top” positions - instead, place the new parameter after `tree_method`, in the most similar place among the remaining parameters according to how it was inserted in `xgb.params`. Note that the rest of the parameters that come after `tree_method` are still meant to follow the same relative order as in `xgb.params` (see the sketch after this list).
2. If the parameter applies in exactly the same way as in `xgb.train`, then no additional documentation is needed for `xgboost`, because it inherits parameters from `xgb.params` by default. However, some parameters might need slight modifications - for example, not all objectives are supported by `xgboost`, so modifications are needed for that parameter.
3. If the parameter allows aliases, use only one alias, and prefer the most descriptive nomenclature (e.g. `learning_rate` instead of `eta`). These parameters also need a `@param` doc entry in `xgboost`, as the one in `xgb.params` will mention the unsupported alias.
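An illustrative sketch of the placement rule (the surrounding argument names are simplified and `new_param` is hypothetical):

```r
xgboost <- function(
  x,
  y,
  objective = NULL,
  nrounds = 100L,
  # ... other "top" parameters stay near the front of the signature ...
  tree_method = NULL,
  new_param = NULL,  # <- hypothetical new parameter inserted here ...
  # ... followed by the remaining parameters, still in xgb.params() order ...
  ...
) {
  # body omitted
}
```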
As new objectives and evaluation metrics are added, be mindful that they need to be added to the docs of both `xgb.params` and `xgboost`. Documentation for objectives in both functions was originally copied from the same .rst file for the core library, but for `xgboost` it undergoes additional modifications in order to list what is and isn’t supported, and to refer only to the parameter aliases that are accepted by `xgboost`.
Keep in mind also that objectives which are variants of one another but with a different prediction mode are not meant to be allowed in `xgboost`, as they would break its intended interface - therefore, such objectives are not described in the docs for `xgboost` (but there is a list at the end of what it doesn’t support) and are checked against in the function `prescreen.objective`.
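A minimal sketch of the kind of check this implies (the function body and the listed objectives are illustrative assumptions, not the actual implementation):

```r
# Illustrative shape only: the real function and its list of disallowed
# objectives live in the package internals
prescreen.objective <- function(objective) {
  # hypothetical examples of prediction-mode variants
  not_supported <- c("binary:logitraw", "multi:softmax")
  if (!is.null(objective) && objective %in% not_supported) {
    stop("objective '", objective, "' is not supported by 'xgboost()'.")
  }
}
```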