class XGBoostClassifier extends ProbabilisticClassifier[Vector, XGBoostClassifier, XGBoostClassificationModel] with XGBoostClassifierParams with DefaultParamsWritable
- XGBoostClassifier
- DefaultParamsWritable
- MLWritable
- XGBoostClassifierParams
- XGBoostEstimatorCommon
- NonParamVariables
- ParamMapFuncs
- RabitParams
- BoosterParams
- LearningTaskParams
- GeneralParams
- HasContribPredictionCol
- HasLeafPredictionCol
- HasNumClass
- HasBaseMarginCol
- HasWeightCol
- ProbabilisticClassifier
- ProbabilisticClassifierParams
- HasThresholds
- HasProbabilityCol
- Classifier
- ClassifierParams
- HasRawPredictionCol
- Predictor
- PredictorParams
- HasPredictionCol
- HasFeaturesCol
- HasLabelCol
- Estimator
- PipelineStage
- Logging
- Params
- Serializable
- Serializable
- Identifiable
- AnyRef
- Any
Instance Constructors
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
$[T](param: Param[T]): T
- Attributes
- protected
- Definition Classes
- Params
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
MLlib2XGBoostParams: Map[String, Any]
- Definition Classes
- ParamMapFuncs
-
def
XGBoost2MLlibParams(xgboostParams: Map[String, Any]): Unit
- Definition Classes
- ParamMapFuncs
-
final
val
allowNonZeroForMissing: BooleanParam
Allows a non-zero value for missing when training or predicting on a sparse or empty vector.
- Definition Classes
- GeneralParams
-
final
val
alpha: DoubleParam
L1 regularization term on weights. Increasing this value makes the model more conservative. [default=0]
- Definition Classes
- BoosterParams
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
final
val
baseMarginCol: Param[String]
Param for the initial prediction (aka base margin) column name.
- Definition Classes
- HasBaseMarginCol
-
final
val
baseScore: DoubleParam
The initial prediction score of all instances (global bias). [default=0.5]
- Definition Classes
- LearningTaskParams
-
final
val
cacheTrainingSet: BooleanParam
Whether to cache the training data.
- Definition Classes
- LearningTaskParams
-
final
val
checkpointInterval: IntParam
Param for setting the checkpoint interval (>= 1) or disabling checkpointing (-1). E.g. 10 means that the trained model will be checkpointed every 10 iterations. Note: checkpoint_path must also be set if the checkpoint interval is greater than 0.
- Definition Classes
- GeneralParams
-
final
val
checkpointPath: Param[String]
The HDFS folder used to load and save checkpoint boosters. default: empty_string
- Definition Classes
- GeneralParams
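The note on checkpointInterval implies that the two checkpoint settings are configured together. A minimal sketch of that pairing follows. It assumes the class lives in the `ml.dmlc.xgboost4j.scala.spark` package and has a no-argument constructor (neither is shown on this page), and the HDFS path is purely hypothetical.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// Assumption: a no-argument constructor exists; the path below is hypothetical.
val checkpointed = new XGBoostClassifier()
  .setNumRound(100)
  .setCheckpointInterval(10) // checkpoint every 10 boosting iterations
  .setCheckpointPath("hdfs:///tmp/xgb-checkpoints") // must be set when the interval is > 0
```

Setting the interval back to -1 disables checkpointing, per the parameter description.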
-
final
def
clear(param: Param[_]): XGBoostClassifier.this.type
- Definition Classes
- Params
-
def
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
final
val
colsampleBylevel: DoubleParam
Subsample ratio of columns for each split, in each level. [default=1] range: (0,1]
- Definition Classes
- BoosterParams
-
final
val
colsampleBytree: DoubleParam
Subsample ratio of columns when constructing each tree. [default=1] range: (0,1]
- Definition Classes
- BoosterParams
-
final
val
contribPredictionCol: Param[String]
Param for the contribution prediction column name.
- Definition Classes
- HasContribPredictionCol
-
def
copy(extra: ParamMap): XGBoostClassifier
- Definition Classes
- XGBoostClassifier → Predictor → Estimator → PipelineStage → Params
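As a hedged illustration of `copy(extra: ParamMap)`, the sketch below derives a second estimator with one parameter overridden. The package name and the no-argument constructor are assumptions not shown on this page, and the depth values are arbitrary.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier
import org.apache.spark.ml.param.ParamMap

// Assumption: a no-argument constructor exists.
val original = new XGBoostClassifier().setMaxDepth(4)

// The extra ParamMap is applied to the copy; the original keeps its own value.
val deeper = original.copy(ParamMap(original.maxDepth -> 10))
```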
-
def
copyValues[T <: Params](to: T, extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
final
val
customEval: CustomEvalParam
Customized evaluation function provided by the user. default: null
- Definition Classes
- GeneralParams
-
final
val
customObj: CustomObjParam
Customized objective function provided by the user. default: null
- Definition Classes
- GeneralParams
-
final
def
defaultCopy[T <: Params](extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
val
eta: DoubleParam
Step size shrinkage used in updates to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative. [default=0.3] range: [0,1]
- Definition Classes
- BoosterParams
-
final
val
evalMetric: Param[String]
Evaluation metrics for validation data. A default metric is assigned according to the objective (rmse for regression, error for classification, mean average precision for ranking). options: rmse, rmsle, mae, logloss, error, merror, mlogloss, auc, aucpr, ndcg, map, gamma-deviance
- Definition Classes
- LearningTaskParams
-
val
evalSetsMap: Map[String, DataFrame]
- Attributes
- protected
- Definition Classes
- NonParamVariables
-
def
explainParam(param: Param[_]): String
- Definition Classes
- Params
-
def
explainParams(): String
- Definition Classes
- Params
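The introspection members inherited from Params (`explainParam`, `isSet`, `hasParam`) can be combined to inspect an estimator's configuration. The following is a small sketch under the assumption that the class is importable from `ml.dmlc.xgboost4j.scala.spark` and has a no-argument constructor. The string name "eta" is assumed to match the parameter's registered name.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// Assumption: a no-argument constructor exists.
val inspected = new XGBoostClassifier().setEta(0.1)

// Human-readable description of a single parameter.
println(inspected.explainParam(inspected.eta))

val etaWasSet = inspected.isSet(inspected.eta)        // explicitly set above
val depthWasSet = inspected.isSet(inspected.maxDepth) // never set explicitly here
val knowsEta = inspected.hasParam("eta")              // lookup by (assumed) name
```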
-
def
extractLabeledPoints(dataset: Dataset[_], numClasses: Int): RDD[org.apache.spark.ml.feature.LabeledPoint]
- Attributes
- protected
- Definition Classes
- Classifier
-
def
extractLabeledPoints(dataset: Dataset[_]): RDD[org.apache.spark.ml.feature.LabeledPoint]
- Attributes
- protected
- Definition Classes
- Predictor
-
final
def
extractParamMap(): ParamMap
- Definition Classes
- Params
-
final
def
extractParamMap(extra: ParamMap): ParamMap
- Definition Classes
- Params
-
final
val
featuresCol: Param[String]
- Definition Classes
- HasFeaturesCol
-
def
finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
fit(dataset: Dataset[_]): XGBoostClassificationModel
- Definition Classes
- Predictor → Estimator
-
def
fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[XGBoostClassificationModel]
- Definition Classes
- Estimator
- Annotations
- @Since( "2.0.0" )
-
def
fit(dataset: Dataset[_], paramMap: ParamMap): XGBoostClassificationModel
- Definition Classes
- Estimator
- Annotations
- @Since( "2.0.0" )
-
def
fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): XGBoostClassificationModel
- Definition Classes
- Estimator
- Annotations
- @Since( "2.0.0" ) @varargs()
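The `fit` overloads listed above differ only in how per-call parameter overrides are passed. A sketch of two of them follows. The package name and no-argument constructor are assumptions, the helper names `fitDepthVariants` and `fitWithOverrides` are hypothetical, and the input dataset is assumed to already carry the configured features and label columns.

```scala
import ml.dmlc.xgboost4j.scala.spark.{XGBoostClassificationModel, XGBoostClassifier}
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.sql.Dataset

// Assumption: a no-argument constructor exists.
val base = new XGBoostClassifier()
  .setObjective("binary:logistic")
  .setNumRound(50)

// fit(dataset, paramMaps): one model per ParamMap.
def fitDepthVariants(train: Dataset[_]): Seq[XGBoostClassificationModel] =
  base.fit(train, Array(ParamMap(base.maxDepth -> 3), ParamMap(base.maxDepth -> 8)))

// fit(dataset, firstParamPair, otherParamPairs*): inline overrides for one call.
def fitWithOverrides(train: Dataset[_]): XGBoostClassificationModel =
  base.fit(train, base.eta -> 0.1, base.maxDepth -> 4)
```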
-
final
val
gamma: DoubleParam
Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger the value, the more conservative the algorithm will be. [default=0] range: [0, Double.MaxValue]
- Definition Classes
- BoosterParams
-
final
def
get[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
final
def
getAllowNonZeroForMissingValue: Boolean
- Definition Classes
- GeneralParams
-
final
def
getAlpha: Double
- Definition Classes
- BoosterParams
-
final
def
getBaseMarginCol: String
- Definition Classes
- HasBaseMarginCol
-
final
def
getBaseScore: Double
- Definition Classes
- LearningTaskParams
-
final
def
getCheckpointInterval: Int
- Definition Classes
- GeneralParams
-
final
def
getCheckpointPath: String
- Definition Classes
- GeneralParams
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
getColsampleBylevel: Double
- Definition Classes
- BoosterParams
-
final
def
getColsampleBytree: Double
- Definition Classes
- BoosterParams
-
final
def
getContribPredictionCol: String
- Definition Classes
- HasContribPredictionCol
-
final
def
getDefault[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
final
def
getEta: Double
- Definition Classes
- BoosterParams
-
final
def
getEvalMetric: String
- Definition Classes
- LearningTaskParams
-
def
getEvalSets(params: Map[String, Any]): Map[String, DataFrame]
- Definition Classes
- NonParamVariables
-
final
def
getFeaturesCol: String
- Definition Classes
- HasFeaturesCol
-
final
def
getGamma: Double
- Definition Classes
- BoosterParams
-
final
def
getGrowPolicy: String
- Definition Classes
- BoosterParams
-
final
def
getInteractionConstraints: String
- Definition Classes
- BoosterParams
-
final
def
getLabelCol: String
- Definition Classes
- HasLabelCol
-
final
def
getLambda: Double
- Definition Classes
- BoosterParams
-
final
def
getLambdaBias: Double
- Definition Classes
- BoosterParams
-
final
def
getLeafPredictionCol: String
- Definition Classes
- HasLeafPredictionCol
-
final
def
getMaxBins: Int
- Definition Classes
- BoosterParams
-
final
def
getMaxDeltaStep: Double
- Definition Classes
- BoosterParams
-
final
def
getMaxDepth: Int
- Definition Classes
- BoosterParams
-
final
def
getMaxLeaves: Int
- Definition Classes
- BoosterParams
-
final
def
getMaximizeEvaluationMetrics: Boolean
- Definition Classes
- LearningTaskParams
-
final
def
getMinChildWeight: Double
- Definition Classes
- BoosterParams
-
final
def
getMissing: Float
- Definition Classes
- GeneralParams
-
final
def
getMonotoneConstraints: String
- Definition Classes
- BoosterParams
-
final
def
getNormalizeType: String
- Definition Classes
- BoosterParams
-
final
def
getNthread: Int
- Definition Classes
- GeneralParams
-
final
def
getNumClass: Int
- Definition Classes
- HasNumClass
-
def
getNumClasses(dataset: Dataset[_], maxNumClasses: Int): Int
- Attributes
- protected
- Definition Classes
- Classifier
-
final
def
getNumEarlyStoppingRounds: Int
- Definition Classes
- LearningTaskParams
-
final
def
getNumRound: Int
- Definition Classes
- GeneralParams
-
final
def
getNumWorkers: Int
- Definition Classes
- GeneralParams
-
final
def
getObjective: String
- Definition Classes
- LearningTaskParams
-
final
def
getObjectiveType: String
- Definition Classes
- LearningTaskParams
-
final
def
getOrDefault[T](param: Param[T]): T
- Definition Classes
- Params
-
def
getParam(paramName: String): Param[Any]
- Definition Classes
- Params
-
final
def
getPredictionCol: String
- Definition Classes
- HasPredictionCol
-
final
def
getProbabilityCol: String
- Definition Classes
- HasProbabilityCol
-
final
def
getRateDrop: Double
- Definition Classes
- BoosterParams
-
final
def
getRawPredictionCol: String
- Definition Classes
- HasRawPredictionCol
-
final
def
getSampleType: String
- Definition Classes
- BoosterParams
-
final
def
getScalePosWeight: Double
- Definition Classes
- BoosterParams
-
final
def
getSeed: Long
- Definition Classes
- GeneralParams
-
final
def
getSilent: Int
- Definition Classes
- GeneralParams
-
final
def
getSketchEps: Double
- Definition Classes
- BoosterParams
-
final
def
getSkipDrop: Double
- Definition Classes
- BoosterParams
-
final
def
getSubsample: Double
- Definition Classes
- BoosterParams
-
def
getThresholds: Array[Double]
- Definition Classes
- HasThresholds
-
final
def
getTimeoutRequestWorkers: Long
- Definition Classes
- GeneralParams
-
final
def
getTrainTestRatio: Double
- Definition Classes
- LearningTaskParams
-
final
def
getTreeLimit: Int
- Definition Classes
- BoosterParams
-
final
def
getTreeMethod: String
- Definition Classes
- BoosterParams
-
final
def
getUseExternalMemory: Boolean
- Definition Classes
- GeneralParams
-
final
def
getVerbosity: Int
- Definition Classes
- GeneralParams
-
final
def
getWeightCol: String
- Definition Classes
- HasWeightCol
-
final
val
growPolicy: Param[String]
Growth policy for the fast histogram algorithm.
- Definition Classes
- BoosterParams
-
final
def
hasDefault[T](param: Param[T]): Boolean
- Definition Classes
- Params
-
def
hasParam(paramName: String): Boolean
- Definition Classes
- Params
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
val
interactionConstraints: Param[String]
- Definition Classes
- BoosterParams
-
final
def
isDefined(param: Param[_]): Boolean
- Definition Classes
- Params
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
isSet(param: Param[_]): Boolean
- Definition Classes
- Params
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
final
val
labelCol: Param[String]
- Definition Classes
- HasLabelCol
-
final
val
lambda: DoubleParam
L2 regularization term on weights. Increasing this value makes the model more conservative. [default=1]
- Definition Classes
- BoosterParams
-
final
val
lambdaBias: DoubleParam
Parameter of the linear booster: L2 regularization term on bias. default: 0 (there is no L1 regularization on bias because it is not important).
- Definition Classes
- BoosterParams
-
final
val
leafPredictionCol: Param[String]
Param for the leaf prediction column name.
- Definition Classes
- HasLeafPredictionCol
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
val
maxBins: IntParam
Maximum number of bins in the histogram.
- Definition Classes
- BoosterParams
-
final
val
maxDeltaStep: DoubleParam
Maximum delta step we allow each tree's weight estimation to be. If the value is set to 0, there is no constraint. If it is set to a positive value, it can help make the update step more conservative. Usually this parameter is not needed, but it might help in logistic regression when the classes are extremely imbalanced; setting it to a value of 1-10 might help control the update. [default=0] range: [0, Double.MaxValue]
- Definition Classes
- BoosterParams
-
final
val
maxDepth: IntParam
Maximum depth of a tree. Increasing this value makes the model more complex and more likely to overfit. [default=6] range: [1, Int.MaxValue]
- Definition Classes
- BoosterParams
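Several parameters on this page (alpha, eta, gamma, lambda, maxDepth) are described in terms of making the model more or less conservative. A sketch of setting them together through the listed setters is shown below. The values are illustrative only, not recommendations, and the package name and no-argument constructor are assumptions.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// Assumption: a no-argument constructor exists. Values are illustrative.
val conservative = new XGBoostClassifier()
  .setEta(0.1)     // smaller step size than the documented default of 0.3
  .setMaxDepth(4)  // shallower trees than the documented default of 6
  .setGamma(1.0)   // require some loss reduction before splitting a leaf
  .setAlpha(0.5)   // L1 regularization on weights
  .setLambda(2.0)  // L2 regularization on weights
```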
-
final
val
maxLeaves: IntParam
Maximum number of nodes to be added. Only relevant when grow_policy=lossguide is set.
- Definition Classes
- BoosterParams
-
final
val
maximizeEvaluationMetrics: BooleanParam
- Definition Classes
- LearningTaskParams
-
final
val
minChildWeight: DoubleParam
Minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with a sum of instance weight less than min_child_weight, the building process gives up further partitioning. In linear regression mode, this simply corresponds to the minimum number of instances needed in each node. The larger the value, the more conservative the algorithm will be. [default=1] range: [0, Double.MaxValue]
- Definition Classes
- BoosterParams
-
final
val
missing: FloatParam
The value treated as missing. default: Float.NaN
- Definition Classes
- GeneralParams
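A sketch of configuring a non-default missing value is shown below. Because this page lists no dedicated setter for allowNonZeroForMissing, the sketch uses the generic public `set(param, value)` member instead. The sentinel -999.0f is a hypothetical choice, and the package name and no-argument constructor are assumptions.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// Assumption: a no-argument constructor exists; -999.0f is a hypothetical sentinel.
val withSentinel = new XGBoostClassifier()
withSentinel.setMissing(-999.0f)

// No dedicated setter is listed on this page, so the generic Params.set is used.
withSentinel.set(withSentinel.allowNonZeroForMissing, true)
```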
-
final
val
monotoneConstraints: Param[String]
- Definition Classes
- BoosterParams
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
needDeterministicRepartitioning: Boolean
- Definition Classes
- XGBoostEstimatorCommon
-
final
val
normalizeType: Param[String]
Parameter of the Dart booster: type of normalization algorithm. options: {'tree', 'forest'}. [default="tree"]
- Definition Classes
- BoosterParams
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
val
nthread: IntParam
Number of threads used per worker. default: 1
- Definition Classes
- GeneralParams
-
final
val
numClass: IntParam
Number of classes.
- Definition Classes
- HasNumClass
-
final
val
numEarlyStoppingRounds: IntParam
If non-zero, the training will be stopped after a specified number of consecutive increases in any evaluation metric.
- Definition Classes
- LearningTaskParams
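Early stopping depends on an evaluation metric, so it is typically configured alongside evalMetric, maximizeEvaluationMetrics, and an evaluation set. The sketch below shows one way these members might be combined. The helper `withValidation` and the set name "validation" are hypothetical, and the package name and no-argument constructor are assumptions.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier
import org.apache.spark.sql.DataFrame

// Assumption: a no-argument constructor exists.
val earlyStopping = new XGBoostClassifier()
  .setObjective("binary:logistic")
  .setEvalMetric("logloss")
  .setMaximizeEvaluationMetrics(false) // lower logloss is better, so do not maximize
  .setNumRound(500)
  .setNumEarlyStoppingRounds(10)

// Hypothetical helper: attaches a named evaluation set to the same estimator.
def withValidation(validation: DataFrame): XGBoostClassifier =
  earlyStopping.setEvalSets(Map("validation" -> validation))
```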
-
final
val
numRound: IntParam
The number of rounds for boosting.
- Definition Classes
- GeneralParams
-
final
val
numWorkers: IntParam
Number of workers used to train the XGBoost model. default: 1
- Definition Classes
- GeneralParams
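Since nthread is counted per worker, the total number of training threads is the product of numWorkers and nthread. A short sketch of that arithmetic follows, with illustrative values and the usual assumptions about the package name and no-argument constructor.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// Assumption: a no-argument constructor exists. Values are illustrative.
val distributed = new XGBoostClassifier()
  .setNumWorkers(4)
  .setNthread(2)

// 4 workers, each using 2 threads.
val totalThreads = distributed.getNumWorkers * distributed.getNthread
```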
-
final
val
objective: Param[String]
Specifies the learning task and the corresponding learning objective. options: reg:squarederror, reg:squaredlogerror, reg:logistic, binary:logistic, binary:logitraw, count:poisson, multi:softmax, multi:softprob, rank:pairwise, reg:gamma. default: reg:squarederror
- Definition Classes
- LearningTaskParams
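The documented default objective is a regression objective, so a classifier would normally set one of the classification options explicitly. The sketch below shows a binary and a multi-class configuration using objectives from the list above. The class count of 3 is hypothetical, and the package name and no-argument constructor are assumptions.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// Assumption: a no-argument constructor exists.
val binary = new XGBoostClassifier()
  .setObjective("binary:logistic")

// Multi-class: the number of classes is supplied via numClass (3 is hypothetical).
val multiclass = new XGBoostClassifier()
  .setObjective("multi:softprob")
  .setNumClass(3)
```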
-
final
val
objectiveType: Param[String]
The learning objective type of the specified custom objective and eval. The corresponding type is assigned if a custom objective is defined. options: regression, classification. default: null
- Definition Classes
- LearningTaskParams
-
lazy val
params: Array[Param[_]]
- Definition Classes
- Params
-
final
val
predictionCol: Param[String]
- Definition Classes
- HasPredictionCol
-
final
val
probabilityCol: Param[String]
- Definition Classes
- HasProbabilityCol
-
final
def
rabitConnectRetry: IntParam
- Definition Classes
- RabitParams
-
final
val
rabitRingReduceThreshold: IntParam
Rabit parameters passed through Rabit.Init into the native layer:
- rabit_ring_reduce_threshold: minimal threshold to enable the ring-based allreduce operation
- rabit_timeout: wait interval before exit after rabit observed failures; set -1 to disable
- dmlc_worker_connect_retry: number of retries to the tracker
- dmlc_worker_stop_process_on_error: exit the process when rabit sees an assert/error
- Definition Classes
- RabitParams
-
final
def
rabitTimeout: IntParam
- Definition Classes
- RabitParams
-
final
val
rateDrop: DoubleParam
Parameter of the Dart booster: dropout rate. [default=0.0] range: [0.0, 1.0]
- Definition Classes
- BoosterParams
-
final
val
rawPredictionCol: Param[String]
- Definition Classes
- HasRawPredictionCol
-
final
val
sampleType: Param[String]
Parameter of the Dart booster: type of sampling algorithm. "uniform": dropped trees are selected uniformly. "weighted": dropped trees are selected in proportion to weight. [default="uniform"]
- Definition Classes
- BoosterParams
-
def
save(path: String): Unit
- Definition Classes
- MLWritable
- Annotations
- @Since( "1.6.0" ) @throws( ... )
-
final
val
scalePosWeight: DoubleParam
Controls the balance of positive and negative weights; useful for unbalanced classes. A typical value to consider: sum(negative cases) / sum(positive cases). [default=1]
- Definition Classes
- BoosterParams
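The suggested value sum(negative cases) / sum(positive cases) can be worked through with concrete counts. In the sketch below the class counts are made up for illustration, and the package name and no-argument constructor are assumptions.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// Hypothetical class counts for an imbalanced binary problem.
val negatives = 9500L
val positives = 500L

// sum(negative cases) / sum(positive cases) = 9500 / 500 = 19.0
val ratio = negatives.toDouble / positives

// Assumption: a no-argument constructor exists.
val weighted = new XGBoostClassifier()
  .setObjective("binary:logistic")
  .setScalePosWeight(ratio)
```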
-
final
val
seed: LongParam
Random seed for the C++ part of XGBoost and train/test splitting.
- Definition Classes
- GeneralParams
-
final
def
set(paramPair: ParamPair[_]): XGBoostClassifier.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set(param: String, value: Any): XGBoostClassifier.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set[T](param: Param[T], value: T): XGBoostClassifier.this.type
- Definition Classes
- Params
- def setAlpha(value: Double): XGBoostClassifier.this.type
- def setBaseMarginCol(value: String): XGBoostClassifier.this.type
- def setBaseScore(value: Double): XGBoostClassifier.this.type
- def setCheckpointInterval(value: Int): XGBoostClassifier.this.type
- def setCheckpointPath(value: String): XGBoostClassifier.this.type
- def setColsampleBylevel(value: Double): XGBoostClassifier.this.type
- def setColsampleBytree(value: Double): XGBoostClassifier.this.type
- def setCustomEval(value: EvalTrait): XGBoostClassifier.this.type
- def setCustomObj(value: ObjectiveTrait): XGBoostClassifier.this.type
-
final
def
setDefault(paramPairs: ParamPair[_]*): XGBoostClassifier.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
setDefault[T](param: Param[T], value: T): XGBoostClassifier.this.type
- Attributes
- protected
- Definition Classes
- Params
- def setEta(value: Double): XGBoostClassifier.this.type
- def setEvalMetric(value: String): XGBoostClassifier.this.type
-
def
setEvalSets(evalSets: Map[String, DataFrame]): XGBoostClassifier.this.type
- Definition Classes
- NonParamVariables
-
def
setFeaturesCol(value: String): XGBoostClassifier
- Definition Classes
- Predictor
- def setGamma(value: Double): XGBoostClassifier.this.type
- def setGrowPolicy(value: String): XGBoostClassifier.this.type
-
def
setLabelCol(value: String): XGBoostClassifier
- Definition Classes
- Predictor
- def setLambda(value: Double): XGBoostClassifier.this.type
- def setLambdaBias(value: Double): XGBoostClassifier.this.type
- def setMaxBins(value: Int): XGBoostClassifier.this.type
- def setMaxDeltaStep(value: Double): XGBoostClassifier.this.type
- def setMaxDepth(value: Int): XGBoostClassifier.this.type
- def setMaxLeaves(value: Int): XGBoostClassifier.this.type
- def setMaximizeEvaluationMetrics(value: Boolean): XGBoostClassifier.this.type
- def setMinChildWeight(value: Double): XGBoostClassifier.this.type
- def setMissing(value: Float): XGBoostClassifier.this.type
- def setNormalizeType(value: String): XGBoostClassifier.this.type
- def setNthread(value: Int): XGBoostClassifier.this.type
- def setNumClass(value: Int): XGBoostClassifier.this.type
- def setNumEarlyStoppingRounds(value: Int): XGBoostClassifier.this.type
- def setNumRound(value: Int): XGBoostClassifier.this.type
- def setNumWorkers(value: Int): XGBoostClassifier.this.type
- def setObjective(value: String): XGBoostClassifier.this.type
- def setObjectiveType(value: String): XGBoostClassifier.this.type
-
def
setPredictionCol(value: String): XGBoostClassifier
- Definition Classes
- Predictor
-
def
setProbabilityCol(value: String): XGBoostClassifier
- Definition Classes
- ProbabilisticClassifier
- def setRateDrop(value: Double): XGBoostClassifier.this.type
-
def
setRawPredictionCol(value: String): XGBoostClassifier
- Definition Classes
- Classifier
- def setSampleType(value: String): XGBoostClassifier.this.type
- def setScalePosWeight(value: Double): XGBoostClassifier.this.type
- def setSeed(value: Long): XGBoostClassifier.this.type
- def setSilent(value: Int): XGBoostClassifier.this.type
- def setSketchEps(value: Double): XGBoostClassifier.this.type
- def setSkipDrop(value: Double): XGBoostClassifier.this.type
- def setSubsample(value: Double): XGBoostClassifier.this.type
-
def
setThresholds(value: Array[Double]): XGBoostClassifier
- Definition Classes
- ProbabilisticClassifier
- def setTimeoutRequestWorkers(value: Long): XGBoostClassifier.this.type
- def setTrainTestRatio(value: Double): XGBoostClassifier.this.type
- def setTreeMethod(value: String): XGBoostClassifier.this.type
- def setUseExternalMemory(value: Boolean): XGBoostClassifier.this.type
- def setWeightCol(value: String): XGBoostClassifier.this.type
-
final
val
silent: IntParam
Deprecated. Please use verbosity instead. 0 means printing running messages, 1 means silent mode. default: 0
- Definition Classes
- GeneralParams
-
final
val
sketchEps: DoubleParam
Only used for the approximate greedy algorithm. This roughly translates into O(1 / sketch_eps) bins. Compared to directly selecting the number of bins, this comes with a theoretical guarantee on sketch accuracy. [default=0.03] range: (0, 1)
- Definition Classes
- BoosterParams
-
final
val
skipCleanCheckpoint: BooleanParam
Whether to skip cleaning up checkpoints. Checkpoints are always cleaned by default; this parameter exists mainly for testing.
- Definition Classes
- LearningTaskParams
-
final
val
skipDrop: DoubleParam
Parameter of the Dart booster: probability of skipping dropout. If a dropout is skipped, new trees are added in the same manner as gbtree. [default=0.0] range: [0.0, 1.0]
- Definition Classes
- BoosterParams
-
final
val
subsample: DoubleParam
Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly collects half of the data instances to grow trees, which helps prevent overfitting. [default=1] range: (0,1]
- Definition Classes
- BoosterParams
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
final
val
thresholds: DoubleArrayParam
- Definition Classes
- HasThresholds
-
final
val
timeoutRequestWorkers: LongParam
The maximum time to wait for the job requesting new workers. default: 30 minutes
- Definition Classes
- GeneralParams
-
def
toString(): String
- Definition Classes
- Identifiable → AnyRef → Any
-
final
val
trackerConf: TrackerConfParam
Rabit tracker configurations. The parameter must be provided as an instance of the TrackerConf class, which has the following definition:
case class TrackerConf(workerConnectionTimeout: Duration, trainingTimeout: Duration, trackerImpl: String)
See below for detailed explanations.
- trackerImpl: Select the implementation of Rabit tracker. default: "python"
Choice between "python" or "scala". The former utilizes the Java wrapper of the Python Rabit tracker (in dmlc_core), and does not support timeout settings. The "scala" version removes Python components, and fully supports timeout settings.
- workerConnectionTimeout: the maximum wait time for all workers to connect to the tracker. default: 0 milliseconds (no timeout)
The timeout value should take the time of data loading and pre-processing into account, due to the lazy execution of Spark's operations. Alternatively, you may force Spark to perform data transformation before calling XGBoost.train(), so that this timeout truly reflects the connection delay. Set a reasonable timeout value to prevent model training/testing from hanging indefinitely, possibly due to network issues. Note that a zero timeout value means to wait indefinitely (equivalent to Duration.Inf). Ignored if the tracker implementation is "python".
- Definition Classes
- GeneralParams
-
def
train(dataset: Dataset[_]): XGBoostClassificationModel
- Attributes
- protected
- Definition Classes
- XGBoostClassifier → Predictor
-
final
val
trainTestRatio: DoubleParam
Fraction of training points to use for testing.
- Definition Classes
- LearningTaskParams
-
def
transformSchema(schema: StructType): StructType
- Definition Classes
- Predictor → PipelineStage
-
def
transformSchema(schema: StructType, logging: Boolean): StructType
- Attributes
- protected
- Definition Classes
- PipelineStage
- Annotations
- @DeveloperApi()
-
final
val
treeLimit: IntParam
- Definition Classes
- BoosterParams
-
final
val
treeMethod: Param[String]
The tree construction algorithm used in XGBoost. options: {'auto', 'exact', 'approx'} [default='auto']
- Definition Classes
- BoosterParams
-
val
uid: String
- Definition Classes
- XGBoostClassifier → Identifiable
-
final
val
useExternalMemory: BooleanParam
Whether to use external memory as cache. default: false
- Definition Classes
- GeneralParams
-
def
validateAndTransformSchema(schema: StructType, fitting: Boolean, featuresDataType: DataType): StructType
- Attributes
- protected
- Definition Classes
- ProbabilisticClassifierParams → ClassifierParams → PredictorParams
-
final
val
verbosity: IntParam
Verbosity of printing messages. Valid values are 0 (silent), 1 (warning), 2 (info), 3 (debug). default: 1
- Definition Classes
- GeneralParams
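Since silent is deprecated in favor of verbosity and this page lists no dedicated verbosity setter, one option is the generic public `set(param, value)` member. A minimal sketch follows, assuming the usual package name and a no-argument constructor.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

// Assumption: a no-argument constructor exists.
val chatty = new XGBoostClassifier()

// 2 corresponds to "info" in the documented value list.
chatty.set(chatty.verbosity, 2)
```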
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
final
val
weightCol: Param[String]
- Definition Classes
- HasWeightCol
-
def
write: MLWriter
- Definition Classes
- DefaultParamsWritable → MLWritable