- abort(WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.DataSourceWriter
-
- abort() - Method in interface org.apache.spark.sql.sources.v2.writer.DataWriter
-
Aborts this writer if it has failed.
- abort(long, WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter
-
- abort(WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter
-
- abortJob(JobContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol
-
Aborts a job after the writes fail.
- abortJob(JobContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
-
- abortTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol
-
Aborts a task after the writes have failed.
- abortTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
-
- abs(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the absolute value of a numeric value.
- abs() - Method in class org.apache.spark.sql.types.Decimal
-
- absent() - Static method in class org.apache.spark.api.java.Optional
-
- AbsoluteError - Class in org.apache.spark.mllib.tree.loss
-
Developer API
Class for absolute error loss calculation (for regression).
- AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
-
- AbstractLauncher<T extends AbstractLauncher<T>> - Class in org.apache.spark.launcher
-
Base class for launcher implementations.
- accept(Parsers) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- accept(ES, Function1<ES, List<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- accept(String, PartialFunction<Object, U>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- accept(Path) - Method in class org.apache.spark.ml.image.SamplePathFilter
-
- acceptIf(Function1<Object, Object>, Function1<Object, String>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- acceptMatch(String, PartialFunction<Object, U>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- acceptSeq(ES, Function1<ES, Iterable<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- acceptsType(DataType) - Method in class org.apache.spark.sql.types.ObjectType
-
- accId() - Method in class org.apache.spark.CleanAccum
-
- accumCleaned(long) - Method in interface org.apache.spark.CleanerListener
-
- Accumulable<R,T> - Class in org.apache.spark
-
- Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
-
Deprecated.
- accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
- accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
- accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
-
- AccumulableInfo - Class in org.apache.spark.scheduler
-
Developer API
Information about an Accumulable modified during a task or stage.
- AccumulableInfo - Class in org.apache.spark.status.api.v1
-
- accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
- AccumulableParam<R,T> - Interface in org.apache.spark
-
- accumulables() - Method in class org.apache.spark.scheduler.StageInfo
-
Terminal values of accumulables updated during this stage, including all the user-defined
accumulators.
- accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
-
Intermediate updates to accumulables during this task.
- accumulablesToJson(Traversable<AccumulableInfo>) - Static method in class org.apache.spark.util.JsonProtocol
-
- Accumulator<T> - Class in org.apache.spark
-
- accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
- AccumulatorContext - Class in org.apache.spark.util
-
An internal class used to track accumulators by Spark itself.
- AccumulatorContext() - Constructor for class org.apache.spark.util.AccumulatorContext
-
- AccumulatorParam<T> - Interface in org.apache.spark
-
- AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.StringAccumulatorParam$ - Class in org.apache.spark
-
- ACCUMULATORS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.StageData
-
- accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.TaskData
-
- AccumulatorV2<IN,OUT> - Class in org.apache.spark.util
-
The base class for accumulators, that can accumulate inputs of type IN, and produce output of
type OUT.
- AccumulatorV2() - Constructor for class org.apache.spark.util.AccumulatorV2
-
- accumUpdates() - Method in class org.apache.spark.ExceptionFailure
-
- accumUpdates() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- accumUpdates() - Method in class org.apache.spark.TaskKilled
-
- accuracy() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Returns accuracy.
- accuracy() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns accuracy
(equal to the number of correctly classified instances
divided by the total number of instances).
- accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns accuracy
- acos(Column) - Static method in class org.apache.spark.sql.functions
-
- acos(String) - Static method in class org.apache.spark.sql.functions
-
- ActivationFunction - Interface in org.apache.spark.ml.ann
-
Trait for functions and their derivatives for functional layers
- active() - Static method in class org.apache.spark.sql.SparkSession
-
Returns the currently active SparkSession, otherwise the default one.
- active() - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
-
Returns a list of active queries associated with this SQLContext
- active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- ACTIVE() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
-
- activeStages() - Method in class org.apache.spark.status.LiveJob
-
- activeTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- activeTasks() - Method in class org.apache.spark.status.LiveExecutor
-
- activeTasks() - Method in class org.apache.spark.status.LiveJob
-
- activeTasks() - Method in class org.apache.spark.status.LiveStage
-
- activeTasksPerExecutor() - Method in class org.apache.spark.status.LiveStage
-
- add(T) - Method in class org.apache.spark.Accumulable
-
Deprecated.
Add more data to this accumulator / accumulable
- add(Vector) - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
-
Adds a new training instance to this ExpectationAggregator, and updates the weights,
means, and covariances of each distribution as well as the log likelihood.
- add(Datum) - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
-
Add a single data point to this aggregator.
- add(AFTPoint) - Method in class org.apache.spark.ml.regression.AFTAggregator
-
Adds a new training instance to this AFTAggregator, and updates the loss and gradient
of the objective function.
- add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Adds a new document.
- add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Adds the given block matrix other to this block matrix: this + other.
- add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Add a new sample to this summarizer, and update the statistical summary.
- add(StructField) - Method in class org.apache.spark.sql.types.StructType
-
- add(String, DataType) - Method in class org.apache.spark.sql.types.StructType
-
Creates a new StructType by adding a new nullable field with no metadata.
- add(String, DataType, boolean) - Method in class org.apache.spark.sql.types.StructType
-
Creates a new StructType by adding a new field with no metadata.
- add(String, DataType, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType
-
Creates a new StructType by adding a new field and specifying metadata.
- add(String, DataType, boolean, String) - Method in class org.apache.spark.sql.types.StructType
-
Creates a new StructType by adding a new field and specifying metadata.
- add(String, String) - Method in class org.apache.spark.sql.types.StructType
-
Creates a new StructType by adding a new nullable field with no metadata where the
dataType is specified as a String.
- add(String, String, boolean) - Method in class org.apache.spark.sql.types.StructType
-
Creates a new StructType by adding a new field with no metadata where the
dataType is specified as a String.
- add(String, String, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType
-
Creates a new StructType by adding a new field and specifying metadata where the
dataType is specified as a String.
- add(String, String, boolean, String) - Method in class org.apache.spark.sql.types.StructType
-
Creates a new StructType by adding a new field and specifying metadata where the
dataType is specified as a String.
- add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
- add(IN) - Method in class org.apache.spark.util.AccumulatorV2
-
Takes the inputs and accumulates.
- add(T) - Method in class org.apache.spark.util.CollectionAccumulator
-
- add(Double) - Method in class org.apache.spark.util.DoubleAccumulator
-
Adds v to the accumulator, i.e.
- add(double) - Method in class org.apache.spark.util.DoubleAccumulator
-
Adds v to the accumulator, i.e.
- add(T) - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
-
- add(Long) - Method in class org.apache.spark.util.LongAccumulator
-
Adds v to the accumulator, i.e.
- add(long) - Method in class org.apache.spark.util.LongAccumulator
-
Adds v to the accumulator, i.e.
- add(Object) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
Increments item's count by one.
- add(Object, long) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
Increments item's count by count.
- add_months(Column, int) - Static method in class org.apache.spark.sql.functions
-
Returns the date that is numMonths after startDate.
- addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
-
Deprecated.
Add additional data to the accumulator value.
- addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
-
Deprecated.
- addAppArgs(String...) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Adds command line arguments for the application.
- addAppArgs(String...) - Method in class org.apache.spark.launcher.SparkLauncher
-
- addBinary(byte[]) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
Increments item's count by one.
- addBinary(byte[], long) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
Increments item's count by count.
- addDirectory(String, File) - Method in interface org.apache.spark.rpc.RpcEnvFileServer
-
Adds a local directory to be served via this file server.
- addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Adds a file to be submitted with the application.
- addFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- addFile(File) - Method in interface org.apache.spark.rpc.RpcEnvFileServer
-
Adds a file to be served by this RpcEnv.
- addFile(String) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(String, boolean) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
-
Add filters, if any, to the given list of ServletContextHandlers
- addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a param with multiple values (overwrites if the input param exists).
- addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a double param with multiple values.
- addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds an int param with multiple values.
- addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a float param with multiple values.
- addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a long param with multiple values.
- addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a boolean param with true and false.
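The addGrid overloads above each register a set of candidate values for one parameter. As a rough illustration of how such a grid can be expanded into every combination, here is a minimal plain-Java sketch. It does not use Spark: the class `ParamGridSketch`, its string-keyed parameters, and its `build` method are hypothetical stand-ins chosen for this example, and the expansion shown is simply a Cartesian product over the registered values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a parameter grid builder; not the Spark class.
public class ParamGridSketch {
    private final Map<String, List<Object>> grid = new LinkedHashMap<>();

    // Registers candidate values for a parameter; a repeated name overwrites the earlier values.
    public ParamGridSketch addGrid(String param, Object... values) {
        grid.put(param, Arrays.asList(values));
        return this;
    }

    // Expands the registered values into every combination (Cartesian product).
    public List<Map<String, Object>> build() {
        List<Map<String, Object>> combos = new ArrayList<>();
        combos.add(new LinkedHashMap<>());
        for (Map.Entry<String, List<Object>> entry : grid.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> combo : combos) {
                for (Object value : entry.getValue()) {
                    Map<String, Object> extended = new LinkedHashMap<>(combo);
                    extended.put(entry.getKey(), value);
                    next.add(extended);
                }
            }
            combos = next;
        }
        return combos;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> combos = new ParamGridSketch()
            .addGrid("regParam", 0.1, 0.01)
            .addGrid("fitIntercept", true, false)
            .build();
        // Two values for each of two parameters give four combinations.
        System.out.println(combos.size());
    }
}
```

Keeping the values in an insertion-ordered map makes the order of the generated combinations deterministic, which is convenient when comparing runs.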
- addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
-
Deprecated.
Merge two accumulated values together.
- addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
Deprecated.
- addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
Deprecated.
- addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
Deprecated.
- addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
Deprecated.
- addInPlace(String, String) - Method in class org.apache.spark.AccumulatorParam.StringAccumulatorParam$
-
Deprecated.
- addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- addJar(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Adds a jar file to be submitted with the application.
- addJar(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- addJar(File) - Method in interface org.apache.spark.rpc.RpcEnvFileServer
-
Adds a jar to be served by this RpcEnv.
- addJar(String) - Method in class org.apache.spark.SparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext
in the future.
- addJar(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Adds a jar to the class loader.
- addJar(String) - Method in class org.apache.spark.sql.hive.HiveSessionResourceLoader
-
- addListener(SparkAppHandle.Listener) - Method in interface org.apache.spark.launcher.SparkAppHandle
-
Adds a listener to be notified of changes to the handle's information.
- addListener(StreamingQueryListener) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
-
- addListener(L) - Method in interface org.apache.spark.util.ListenerBus
-
Adds a listener to listen for events.
- addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
-
Add Hadoop configuration specific to a single partition and attempt.
- addLong(long) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
Increments item's count by one.
- addLong(long, long) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
Increments item's count by count.
- addMapOutput(int, MapStatus) - Method in class org.apache.spark.ShuffleStatus
-
Register a map output.
- addMetrics(TaskMetrics, TaskMetrics) - Static method in class org.apache.spark.status.LiveEntityHelpers
-
Add m2 values to m1.
- addPartition(LiveRDDPartition) - Method in class org.apache.spark.status.RDDPartitionSeq
-
- addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- addPyFile(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Adds a python file / zip / egg to be submitted with the application.
- addPyFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- address() - Method in class org.apache.spark.BarrierTaskInfo
-
- address() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-
- addSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
-
- addShutdownHook(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.ShutdownHookManager
-
Adds a shutdown hook with default priority.
- addShutdownHook(int, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.ShutdownHookManager
-
Adds a shutdown hook with the given priority.
- addSparkArg(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Adds a no-value argument to the Spark invocation.
- addSparkArg(String, String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Adds an argument with a value to the Spark invocation.
- addSparkArg(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- addSparkArg(String, String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- addSparkListener(SparkListenerInterface) - Method in class org.apache.spark.SparkContext
-
Developer API
Register a listener to receive up-calls from events that happen during execution.
- addSparkVersionMetadata(RecordWriter<NullWritable, Writable>) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
Adds metadata specifying the Spark version.
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
-
- addString(String) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
Increments item's count by one.
- addString(String, long) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
Increments item's count by count.
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.BarrierTaskContext
-
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
-
Adds a (Java friendly) listener to be executed on task completion.
- addTaskCompletionListener(Function1<TaskContext, U>) - Method in class org.apache.spark.TaskContext
-
Adds a listener in the form of a Scala closure to be executed on task completion.
- addTaskFailureListener(TaskFailureListener) - Method in class org.apache.spark.BarrierTaskContext
-
- addTaskFailureListener(TaskFailureListener) - Method in class org.apache.spark.TaskContext
-
Adds a listener to be executed on task failure.
- addTaskFailureListener(Function2<TaskContext, Throwable, BoxedUnit>) - Method in class org.apache.spark.TaskContext
-
Adds a listener to be executed on task failure.
- addTaskSetManager(Schedulable, Properties) - Method in interface org.apache.spark.scheduler.SchedulableBuilder
-
- addTime() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- addTime() - Method in class org.apache.spark.status.LiveExecutor
-
- addURL(URL) - Method in class org.apache.spark.util.MutableURLClassLoader
-
- AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
-
- AFTAggregator - Class in org.apache.spark.ml.regression
-
AFTAggregator computes the gradient and loss for an AFT loss function,
as used in AFT survival regression, for samples given as sparse or dense vectors, in an online fashion.
- AFTAggregator(Broadcast<DenseVector<Object>>, boolean, Broadcast<double[]>) - Constructor for class org.apache.spark.ml.regression.AFTAggregator
-
- AFTCostFun - Class in org.apache.spark.ml.regression
-
AFTCostFun implements Breeze's DiffFunction[T] for AFT cost.
- AFTCostFun(RDD<AFTPoint>, boolean, Broadcast<double[]>, int) - Constructor for class org.apache.spark.ml.regression.AFTCostFun
-
- AFTSurvivalRegression - Class in org.apache.spark.ml.regression
-
- AFTSurvivalRegression(String) - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- AFTSurvivalRegression() - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- AFTSurvivalRegressionModel - Class in org.apache.spark.ml.regression
-
- AFTSurvivalRegressionParams - Interface in org.apache.spark.ml.regression
-
Params for accelerated failure time (AFT) regression.
- agg(Column, Column...) - Method in class org.apache.spark.sql.Dataset
-
Aggregates on the entire Dataset without groups.
- agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.Dataset
-
(Scala-specific) Aggregates on the entire Dataset without groups.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.Dataset
-
(Scala-specific) Aggregates on the entire Dataset without groups.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.Dataset
-
(Java-specific) Aggregates on the entire Dataset without groups.
- agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Aggregates on the entire Dataset without groups.
- agg(TypedColumn<V, U1>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Computes the given aggregation, returning a Dataset of tuples for each unique key
and the result of computing this aggregation over all elements in the group.
- agg(TypedColumn<V, U1>, TypedColumn<V, U2>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Computes the given aggregations, returning a Dataset of tuples for each unique key
and the result of computing these aggregations over all elements in the group.
- agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Computes the given aggregations, returning a Dataset of tuples for each unique key
and the result of computing these aggregations over all elements in the group.
- agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>, TypedColumn<V, U4>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Computes the given aggregations, returning a Dataset of tuples for each unique key
and the result of computing these aggregations over all elements in the group.
- agg(Column, Column...) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute aggregates by specifying a series of aggregate columns.
- agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
(Scala-specific) Compute aggregates by specifying the column names and
aggregate methods.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
(Scala-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
(Java-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute aggregates by specifying a series of aggregate columns.
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
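The description above names three ingredients: a neutral zero value, a function that folds elements into a per-partition result, and a function that merges those partial results. The following plain-Java sketch illustrates that two-level fold on in-memory lists. It does not call Spark: the class `AggregateSketch` and the nested-list model of partitions are assumptions made for illustration, and the zero value is assumed to be immutable because the sketch reuses it for every partition.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical illustration of the aggregate pattern; partitions are modeled as nested lists.
public class AggregateSketch {
    // Folds each partition with seqOp starting from zero, then merges the partial results with combOp.
    public static <T, U> U aggregate(List<List<T>> partitions, U zero,
                                     BiFunction<U, T, U> seqOp, BinaryOperator<U> combOp) {
        U result = zero;
        for (List<T> partition : partitions) {
            U partial = zero; // assumes an immutable zero value
            for (T element : partition) {
                partial = seqOp.apply(partial, element);
            }
            result = combOp.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
            Arrays.asList(1, 2, 3), Arrays.asList(4, 5));
        Integer sum = aggregate(partitions, 0, (acc, x) -> acc + x, Integer::sum);
        System.out.println(sum); // prints 15
    }
}
```

Because the zero value is applied once per partition and once more in the final merge, it needs to be neutral for both functions (0 for addition, for example); otherwise the result would depend on how the data happens to be partitioned.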
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
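The per-key variant described in the aggregateByKey entries follows the same idea, applied separately to each key. Below is a minimal plain-Java sketch under the same assumptions as any in-memory model: it does not use Spark, the class `AggregateByKeySketch` is hypothetical, partitions are modeled as lists of key-value entries, and the zero value is assumed to be immutable and the aggregated values non-null.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical illustration of per-key aggregation; not the Spark implementation.
public class AggregateByKeySketch {
    // Within each partition, folds values into a per-key accumulator starting from zero (seqOp);
    // across partitions, merges the per-key accumulators (combOp).
    public static <K, V, U> Map<K, U> aggregateByKey(
            List<List<Map.Entry<K, V>>> partitions, U zero,
            BiFunction<U, V, U> seqOp, BinaryOperator<U> combOp) {
        Map<K, U> merged = new HashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            Map<K, U> partial = new HashMap<>();
            for (Map.Entry<K, V> pair : partition) {
                U acc = partial.getOrDefault(pair.getKey(), zero);
                partial.put(pair.getKey(), seqOp.apply(acc, pair.getValue()));
            }
            for (Map.Entry<K, U> entry : partial.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(), combOp);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> first = new ArrayList<>();
        first.add(new AbstractMap.SimpleEntry<>("a", 1));
        first.add(new AbstractMap.SimpleEntry<>("b", 2));
        List<Map.Entry<String, Integer>> second = new ArrayList<>();
        second.add(new AbstractMap.SimpleEntry<>("a", 3));
        List<List<Map.Entry<String, Integer>>> partitions = Arrays.asList(first, second);

        // Sums the values of each key: "a" maps to 4 and "b" maps to 2.
        Map<String, Integer> sums = aggregateByKey(partitions, 0, (acc, v) -> acc + v, Integer::sum);
        System.out.println(sums);
    }
}
```

The accumulator type U may differ from the value type V, which is what distinguishes this pattern from a plain per-key reduce.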
- AggregatedDialect - Class in org.apache.spark.sql.jdbc
-
AggregatedDialect can unify multiple dialects into one virtual Dialect.
- AggregatedDialect(List<JdbcDialect>) - Constructor for class org.apache.spark.sql.jdbc.AggregatedDialect
-
- aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Aggregates vertices in messages that have the same ids using reduceFunc, returning a
VertexRDD co-indexed with this.
- AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
-
- AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- aggregationDepth() - Method in interface org.apache.spark.ml.param.shared.HasAggregationDepth
-
Param for suggested depth for treeAggregate (>= 2).
- Aggregator<K,V,C> - Class in org.apache.spark
-
Developer API
A set of functions used to aggregate data.
- Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
-
- aggregator() - Method in class org.apache.spark.ShuffleDependency
-
- Aggregator<IN,BUF,OUT> - Class in org.apache.spark.sql.expressions
-
Experimental
A base class for user-defined aggregations, which can be used in Dataset operations to take
all of the elements of a group and reduce them to a single value.
- Aggregator() - Constructor for class org.apache.spark.sql.expressions.Aggregator
-
- aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
-
- aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
-
- aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
-
- aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
-
- aic() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
Akaike Information Criterion (AIC) for the fitted model.
- Algo - Class in org.apache.spark.mllib.tree.configuration
-
Enum to select the algorithm for the decision tree
- Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
-
- algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- alias(String) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias.
- alias(String) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with an alias set.
- alias(Symbol) - Method in class org.apache.spark.sql.Dataset
-
(Scala-specific) Returns a new Dataset with an alias set.
- All - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose all the fields (source, edge, and destination).
- AllJobsCancelled - Class in org.apache.spark.scheduler
-
- AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
-
- allocator() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
-
- AllReceiverIds - Class in org.apache.spark.streaming.scheduler
-
A message used by ReceiverTracker to request the IDs of all receivers still stored in
ReceiverTrackerEndpoint.
- AllReceiverIds() - Constructor for class org.apache.spark.streaming.scheduler.AllReceiverIds
-
- allSources() - Static method in class org.apache.spark.metrics.source.StaticSources
-
The set of all static sources.
- alpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for the alpha parameter in the implicit preference formulation (nonnegative).
- alpha() - Method in class org.apache.spark.mllib.random.WeibullGenerator
-
- ALS - Class in org.apache.spark.ml.recommendation
-
Alternating Least Squares (ALS) matrix factorization.
- ALS(String) - Constructor for class org.apache.spark.ml.recommendation.ALS
-
- ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
-
- ALS - Class in org.apache.spark.mllib.recommendation
-
Alternating Least Squares matrix factorization.
- ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
-
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10,
lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
- ALS.InBlock$ - Class in org.apache.spark.ml.recommendation
-
- ALS.LeastSquaresNESolver - Interface in org.apache.spark.ml.recommendation
-
Trait for least squares solvers applied to the normal equation.
- ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation
-
Developer API
Rating class for better code readability.
- ALS.Rating$ - Class in org.apache.spark.ml.recommendation
-
- ALS.RatingBlock$ - Class in org.apache.spark.ml.recommendation
-
- ALSModel - Class in org.apache.spark.ml.recommendation
-
Model fitted by ALS.
- ALSModelParams - Interface in org.apache.spark.ml.recommendation
-
Common params for ALS and ALSModel.
- ALSParams - Interface in org.apache.spark.ml.recommendation
-
Common params for ALS.
- alterDatabase(CatalogDatabase) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Alter a database whose name matches the one specified in `database`, assuming it exists.
- alterFunction(String, CatalogFunction) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Alter a function whose name matches the one specified in `func`, assuming it exists.
- alterPartitions(String, String, Seq<CatalogTablePartition>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Alter one or more table partitions whose specs match the ones specified in `newParts`,
assuming the partitions exist.
- alterTable(CatalogTable) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Alter a table whose name matches the one specified in `table`, assuming it exists.
- alterTable(String, String, CatalogTable) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Updates the given table with new metadata, optionally renaming the table or
moving it to a different database.
- alterTableDataSchema(String, String, StructType, Map<String, String>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Updates the given table with a new data schema and table properties, and keeps everything else
unchanged.
- am() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager
-
- AnalysisException - Exception in org.apache.spark.sql
-
Thrown when a query fails to analyze, usually because the query itself is invalid.
- and(Column) - Method in class org.apache.spark.sql.Column
-
Boolean AND.
- And - Class in org.apache.spark.sql.sources
-
A filter that evaluates to `true` iff both `left` and `right` evaluate to `true`.
- And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
-
- antecedent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
-
- ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- AnyDataType - Class in org.apache.spark.sql.types
-
An AbstractDataType that matches any concrete data types.
- AnyDataType() - Constructor for class org.apache.spark.sql.types.AnyDataType
-
- anyNull() - Method in interface org.apache.spark.sql.Row
-
Returns true if there are any NULL values in this row.
- anyNull() - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- ApiHelper - Class in org.apache.spark.ui.jobs
-
- ApiHelper() - Constructor for class org.apache.spark.ui.jobs.ApiHelper
-
- ApiRequestContext - Interface in org.apache.spark.status.api.v1
-
- appAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- Append() - Static method in class org.apache.spark.sql.streaming.OutputMode
-
OutputMode in which only the new rows in the streaming DataFrame/Dataset will be
written to the sink.
- appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Returns a new vector with `1.0` (bias) appended to the input vector.
- appendColumn(StructType, String, DataType, boolean) - Static method in class org.apache.spark.ml.util.SchemaUtils
-
Appends a new column to the input schema.
- appendColumn(StructType, StructField) - Static method in class org.apache.spark.ml.util.SchemaUtils
-
Appends a new column to the input schema.
- appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- AppHistoryServerPlugin - Interface in org.apache.spark.status
-
An interface for creating history listeners (to replay event logs) defined in other modules like
SQL, and for setting up the UI of the plugin to rebuild the history UI.
- appId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- appId() - Method in interface org.apache.spark.status.api.v1.BaseAppResource
-
- APPLICATION_EXECUTOR_LIMIT() - Static method in class org.apache.spark.ui.ToolTips
-
- applicationAttemptId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
Get the attempt ID for this run, if the cluster manager supports multiple
attempts.
- applicationAttemptId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
Get an application's attempt ID associated with the job.
- applicationAttemptId() - Method in class org.apache.spark.SparkContext
-
- ApplicationAttemptInfo - Class in org.apache.spark.status.api.v1
-
- applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
-
- ApplicationEnvironmentInfo - Class in org.apache.spark.status.api.v1
-
- applicationId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
Get an application ID associated with the job.
- applicationId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
Get an application ID associated with the job.
- applicationId() - Method in class org.apache.spark.SparkContext
-
A unique identifier for the Spark application.
- ApplicationInfo - Class in org.apache.spark.status.api.v1
-
- applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
-
- ApplicationStatus - Enum in org.apache.spark.status.api.v1
-
- apply(T1) - Static method in class org.apache.spark.CleanAccum
-
- apply(T1) - Static method in class org.apache.spark.CleanBroadcast
-
- apply(T1) - Static method in class org.apache.spark.CleanCheckpoint
-
- apply(T1) - Static method in class org.apache.spark.CleanRDD
-
- apply(T1) - Static method in class org.apache.spark.CleanShuffle
-
- apply(T1, T2) - Static method in class org.apache.spark.ContextBarrierId
-
- apply(T1, T2, T3, T4, T5, T6, T7) - Static method in class org.apache.spark.ExceptionFailure
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.ExecutorLostFailure
-
- apply(T1) - Static method in class org.apache.spark.ExecutorRegistered
-
- apply(T1) - Static method in class org.apache.spark.ExecutorRemoved
-
- apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.FetchFailed
-
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of vertices and
edges with attributes.
- apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
- apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
- apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
-
Execute a Pregel-like iterative vertex-parallel abstraction.
- apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an
EdgeRDD) from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
- apply(DenseMatrix<Object>, DenseMatrix<Object>, Function1<Object, Object>) - Static method in class org.apache.spark.ml.ann.ApplyInPlace
-
- apply(DenseMatrix<Object>, DenseMatrix<Object>, DenseMatrix<Object>, Function2<Object, Object, Object>) - Static method in class org.apache.spark.ml.ann.ApplyInPlace
-
- apply(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Gets an attribute by its name.
- apply(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Gets an attribute by its index.
- apply(T1, T2) - Static method in class org.apache.spark.ml.clustering.ClusterData
-
- apply(T1, T2) - Static method in class org.apache.spark.ml.feature.LabeledPoint
-
- apply(int, int) - Method in class org.apache.spark.ml.linalg.DenseMatrix
-
- apply(int) - Method in class org.apache.spark.ml.linalg.DenseVector
-
- apply(int, int) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Gets the (i, j)-th element.
- apply(int, int) - Method in class org.apache.spark.ml.linalg.SparseMatrix
-
- apply(int) - Method in class org.apache.spark.ml.linalg.SparseVector
-
- apply(int) - Method in interface org.apache.spark.ml.linalg.Vector
-
Gets the value of the ith element.
- apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Gets the value of the input param or its default value if it does not exist.
- apply(GeneralizedLinearRegressionBase) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.FamilyAndLink$
-
Constructs the FamilyAndLink object from a parameter map.
- apply(Split) - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData$
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
-
- apply(T1, T2, T3, T4) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
-
- apply(BinaryConfusionMatrix) - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryClassificationMetricComputer
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
-
- apply(T1) - Static method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data
-
- apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.mllib.feature.VocabWord
-
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- apply(T1, T2) - Static method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-
- apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Gets the (i, j)-th element.
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Gets the value of the ith element.
- apply(T1, T2, T3) - Static method in class org.apache.spark.mllib.recommendation.Rating
-
- apply(T1, T2) - Static method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
-
- apply(T1, T2) - Static method in class org.apache.spark.mllib.stat.test.BinarySample
-
- apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- apply(int, Node) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData$
-
- apply(Row) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData$
-
- apply(int, Node) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- apply(Row) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- apply(Predict) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData$
-
- apply(Row) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData$
-
- apply(Predict) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- apply(Row) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- apply(Split) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData$
-
- apply(Row) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData$
-
- apply(Split) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- apply(Row) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
- apply(T1, T2, T3, T4) - Static method in class org.apache.spark.mllib.tree.model.Split
-
- apply(int) - Static method in class org.apache.spark.rdd.CheckpointState
-
- apply(int) - Static method in class org.apache.spark.rdd.DeterministicLevel
-
- apply(long, String, Option<String>, String, boolean) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(T1, T2, T3, T4) - Static method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- apply(T1, T2) - Static method in class org.apache.spark.scheduler.BlacklistedExecutor
-
- apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
Alternate factory method that takes a ByteBuffer directly for the data field.
- apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.local.KillTask
-
- apply() - Static method in class org.apache.spark.scheduler.local.ReviveOffers
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.local.StatusUpdate
-
- apply() - Static method in class org.apache.spark.scheduler.local.StopExecutor
-
- apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
-
- apply(int) - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- apply(T1, T2, T3, T4, T5, T6) - Static method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerBlockUpdated
-
- apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
-
- apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
-
- apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerJobEnd
-
- apply(T1, T2, T3, T4) - Static method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerLogStart
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
-
- apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
-
- apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
-
- apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerSpeculativeTaskSubmitted
-
- apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- apply(T1, T2, T3, T4, T5, T6) - Static method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- apply(int) - Static method in class org.apache.spark.scheduler.TaskLocality
-
- apply(Object) - Method in class org.apache.spark.sql.Column
-
Extracts a value or values from a complex type.
- apply(String) - Method in class org.apache.spark.sql.Dataset
-
Selects a column based on the column name and returns it as a Column.
- apply(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
Creates a Column for this UDAF using the given Columns as input arguments.
- apply(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
Creates a Column for this UDAF using the given Columns as input arguments.
- apply(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
Returns an expression that invokes the UDF, using the given arguments.
- apply(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
Returns an expression that invokes the UDF, using the given arguments.
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.DetermineTableStats
-
- apply(T1, T2, T3, T4) - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
-
- apply(ScriptInputOutputSchema) - Static method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
-
- apply(T1, T2, T3, T4, T5, T6) - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.HiveAnalysis
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans$
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.Scripts$
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.HiveStrategies.Scripts
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.hive.HiveUDAFBuffer
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.RelationConversions
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.ResolveHiveSerdeTable
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.jdbc.JdbcType
-
- apply(Dataset<Row>, Seq<Expression>, RelationalGroupedDataset.GroupType) - Static method in class org.apache.spark.sql.RelationalGroupedDataset
-
- apply(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i.
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.And
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.EqualNullSafe
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.EqualTo
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.GreaterThan
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.In
-
- apply(T1) - Static method in class org.apache.spark.sql.sources.IsNotNull
-
- apply(T1) - Static method in class org.apache.spark.sql.sources.IsNull
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.LessThan
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.LessThanOrEqual
-
- apply(T1) - Static method in class org.apache.spark.sql.sources.Not
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.Or
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.StringContains
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.StringEndsWith
-
- apply(T1, T2) - Static method in class org.apache.spark.sql.sources.StringStartsWith
-
- apply(String) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
-
- apply(Duration) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
-
- apply(DataType) - Static method in class org.apache.spark.sql.types.ArrayType
-
Construct an ArrayType object with the given element type.
- apply(T1) - Static method in class org.apache.spark.sql.types.CharType
-
- apply(double) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(long) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(int) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigInteger) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigInt) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(String) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(DataType, DataType) - Static method in class org.apache.spark.sql.types.MapType
-
Construct a MapType object with the given key type and value type.
- apply(T1, T2, T3, T4) - Static method in class org.apache.spark.sql.types.StructField
-
- apply(String) - Method in class org.apache.spark.sql.types.StructType
-
- apply(Set<String>) - Method in class org.apache.spark.sql.types.StructType
-
Returns a StructType containing StructFields of the given names, preserving the
original order of fields.
- apply(int) - Method in class org.apache.spark.sql.types.StructType
-
- apply(T1) - Static method in class org.apache.spark.sql.types.VarcharType
-
- apply(T1, T2, T3, T4, T5, T6, T7, T8) - Static method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- apply(T1, T2, T3, T4, T5, T6, T7) - Static method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- apply(T1) - Static method in class org.apache.spark.status.api.v1.StackTrace
-
- apply(T1, T2, T3, T4, T5, T6, T7) - Static method in class org.apache.spark.status.api.v1.ThreadStackTrace
-
- apply(int) - Method in class org.apache.spark.status.RDDPartitionSeq
-
- apply(String) - Static method in class org.apache.spark.storage.BlockId
-
- apply(String, String, int, Option<String>) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(T1, T2) - Static method in class org.apache.spark.storage.BroadcastBlockId
-
- apply(T1, T2) - Static method in class org.apache.spark.storage.RDDBlockId
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.storage.ShuffleBlockId
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.storage.ShuffleDataBlockId
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
Developer API
Create a new StorageLevel object.
- apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
Developer API
Create a new StorageLevel object without setting useOffHeap.
- apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
-
Developer API
Create a new StorageLevel object from its integer representation.
- apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
-
Developer API
Read StorageLevel object from ObjectInput stream.
- apply(T1, T2) - Static method in class org.apache.spark.storage.StreamBlockId
-
- apply(T1) - Static method in class org.apache.spark.storage.TaskResultBlockId
-
- apply(T1) - Static method in class org.apache.spark.streaming.Duration
-
- apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
-
- apply(long) - Static method in class org.apache.spark.streaming.Minutes
-
- apply(T1, T2, T3, T4, T5, T6) - Static method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- apply(T1, T2, T3, T4, T5, T6, T7) - Static method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-
- apply(T1, T2, T3, T4, T5, T6, T7, T8) - Static method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- apply(int) - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
-
- apply(long) - Static method in class org.apache.spark.streaming.Seconds
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.TaskCommitDenied
-
- apply(T1, T2, T3) - Static method in class org.apache.spark.TaskKilled
-
- apply(int) - Static method in class org.apache.spark.TaskState
-
- apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values.
- apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values passed as variable-length arguments.
- ApplyInPlace - Class in org.apache.spark.ml.ann
-
Implements in-place application of functions to arrays.
- ApplyInPlace() - Constructor for class org.apache.spark.ml.ann.ApplyInPlace
-
- applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- appName() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appName() - Method in class org.apache.spark.SparkContext
-
- appName(String) - Method in class org.apache.spark.sql.SparkSession.Builder
-
Sets a name for the application, which will be shown in the Spark web UI.
- approx_count_distinct(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approx_count_distinct(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approx_count_distinct(Column, double) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approx_count_distinct(String, double) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions
-
- approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions
-
- approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions
-
- approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions
-
- ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- ApproximateEvaluator<U,R> - Interface in org.apache.spark.partial
-
An object that computes a function incrementally by merging in results of type U from multiple
tasks.
- approxQuantile(String, double[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Calculates the approximate quantiles of a numerical column of a DataFrame.
- approxQuantile(String[], double[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Calculates the approximate quantiles of numerical columns of a DataFrame.
- appSparkVersion() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- AppStatusUtils - Class in org.apache.spark.status
-
- AppStatusUtils() - Constructor for class org.apache.spark.status.AppStatusUtils
-
- AreaUnderCurve - Class in org.apache.spark.mllib.evaluation
-
Computes the area under the curve (AUC) using the trapezoidal rule.
- AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
-
- areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the precision-recall curve.
- areaUnderROC() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-
Computes the area under the receiver operating characteristic (ROC) curve.
- areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the receiver operating characteristic (ROC) curve.
- argmax() - Method in class org.apache.spark.ml.linalg.DenseVector
-
- argmax() - Method in class org.apache.spark.ml.linalg.SparseVector
-
- argmax() - Method in interface org.apache.spark.ml.linalg.Vector
-
Find the index of a maximal element.
- argmax() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- argmax() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- argmax() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Find the index of a maximal element.
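The idea behind the argmax() entries can be sketched in plain Java over a dense array. This is an illustration, not Spark's implementation: the class `ArgmaxSketch`, returning the first index on ties, and returning -1 for an empty input are all assumptions of the sketch.

```java
public class ArgmaxSketch {
    /** Index of a maximal element; first index wins ties, -1 if empty (sketch assumptions). */
    static int argmax(double[] values) {
        if (values.length == 0) {
            return -1;
        }
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) { // strict comparison keeps the first maximum
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(argmax(new double[] {1.0, 7.0, 3.0, 7.0})); // prints 1
    }
}
```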
- argString() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
-
- array(DataType) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type array.
- array(Column...) - Static method in class org.apache.spark.sql.functions
-
Creates a new array column.
- array(String, String...) - Static method in class org.apache.spark.sql.functions
-
Creates a new array column.
- array(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Creates a new array column.
- array(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Creates a new array column.
- array() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- array_contains(Column, Object) - Static method in class org.apache.spark.sql.functions
-
Returns null if the array is null, true if the array contains value, and false otherwise.
- array_distinct(Column) - Static method in class org.apache.spark.sql.functions
-
Removes duplicate values from the array.
- array_except(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns an array of the elements in the first array but not in the second array,
without duplicates.
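The semantics described for array_except (elements of the first array that are absent from the second, without duplicates) can be sketched with ordinary Java collections. This does not use Spark; the class `ArrayExceptSketch` is hypothetical, and preserving first-occurrence order is an assumption of the sketch rather than a documented guarantee.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ArrayExceptSketch {
    /** Elements of first that do not occur in second, with duplicates dropped. */
    static <T> List<T> arrayExcept(List<T> first, List<T> second) {
        Set<T> result = new LinkedHashSet<>(first); // de-duplicates, keeps insertion order
        result.removeAll(new HashSet<>(second));    // drop anything present in second
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        System.out.println(arrayExcept(List.of(1, 2, 2, 3, 4), List.of(2, 4))); // prints [1, 3]
    }
}
```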
- array_intersect(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns an array of the elements in the intersection of the given two arrays,
without duplicates.
- array_join(Column, String, String) - Static method in class org.apache.spark.sql.functions
-
Concatenates the elements of column using the delimiter.
- array_join(Column, String) - Static method in class org.apache.spark.sql.functions
-
Concatenates the elements of column using the delimiter.
- array_max(Column) - Static method in class org.apache.spark.sql.functions
-
Returns the maximum value in the array.
- array_min(Column) - Static method in class org.apache.spark.sql.functions
-
Returns the minimum value in the array.
- array_position(Column, Object) - Static method in class org.apache.spark.sql.functions
-
Locates the position of the first occurrence of the value in the given array as long.
- array_remove(Column, Object) - Static method in class org.apache.spark.sql.functions
-
Removes all elements equal to element from the given array.
- array_repeat(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Creates an array containing the left argument repeated the number of times given by the
right argument.
- array_repeat(Column, int) - Static method in class org.apache.spark.sql.functions
-
Creates an array containing the left argument repeated the number of times given by the
right argument.
- array_sort(Column) - Static method in class org.apache.spark.sql.functions
-
Sorts the input array in ascending order.
- array_union(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns an array of the elements in the union of the given two arrays, without duplicates.
- arrayLengthGt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check that the array length is greater than lowerBound.
- arrays_overlap(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns true if a1 and a2 have at least one non-null element in common.
- arrays_zip(Column...) - Static method in class org.apache.spark.sql.functions
-
Returns a merged array of structs in which the N-th struct contains all N-th values of input
arrays.
- arrays_zip(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Returns a merged array of structs in which the N-th struct contains all N-th values of input
arrays.
- ArrayType - Class in org.apache.spark.sql.types
-
- ArrayType(DataType, boolean) - Constructor for class org.apache.spark.sql.types.ArrayType
-
- arrayValues() - Method in class org.apache.spark.storage.memory.DeserializedValuesHolder
-
- ArrowColumnVector - Class in org.apache.spark.sql.vectorized
-
A column vector backed by Apache Arrow.
- ArrowColumnVector(ValueVector) - Constructor for class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- as(Encoder<U>) - Method in class org.apache.spark.sql.Column
-
Provides a type hint about the expected return value of this column.
- as(String) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias.
- as(Seq<String>) - Method in class org.apache.spark.sql.Column
-
(Scala-specific) Assigns the given aliases to the results of a table generating function.
- as(String[]) - Method in class org.apache.spark.sql.Column
-
Assigns the given aliases to the results of a table generating function.
- as(Symbol) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias.
- as(String, Metadata) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias with metadata.
- as(Encoder<U>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
Returns a new Dataset where each record has been mapped on to the specified type.
- as(String) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with an alias set.
- as(Symbol) - Method in class org.apache.spark.sql.Dataset
-
(Scala-specific) Returns a new Dataset with an alias set.
- asBinary() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Convenient method for casting to binary logistic regression summary.
- asBreeze() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts to a breeze matrix.
- asBreeze() - Method in interface org.apache.spark.ml.linalg.Vector
-
Converts the instance to a breeze vector.
- asBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Converts to a breeze matrix.
- asBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the instance to a breeze vector.
- asc() - Method in class org.apache.spark.sql.Column
-
Returns a sort expression based on ascending order of the column.
- asc(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on ascending order of the column.
- asc_nulls_first() - Method in class org.apache.spark.sql.Column
-
Returns a sort expression based on ascending order of the column,
and null values appear before non-null values.
- asc_nulls_first(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on ascending order of the column,
and null values appear before non-null values.
- asc_nulls_last() - Method in class org.apache.spark.sql.Column
-
Returns a sort expression based on ascending order of the column,
and null values appear after non-null values.
- asc_nulls_last(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on ascending order of the column,
and null values appear after non-null values.
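The difference between the asc_nulls_first and asc_nulls_last orderings can be illustrated with the standard `java.util.Comparator` helpers. This is a plain-Java analogy and not a Spark call; the class `NullOrderingSketch` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NullOrderingSketch {
    /** Ascending order with nulls placed before non-null values. */
    static List<Integer> sortAscNullsFirst(List<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(Comparator.nullsFirst(Comparator.<Integer>naturalOrder()));
        return copy;
    }

    /** Ascending order with nulls placed after non-null values. */
    static List<Integer> sortAscNullsLast(List<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(Comparator.nullsLast(Comparator.<Integer>naturalOrder()));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, null, 1);
        System.out.println(sortAscNullsFirst(data)); // prints [null, 1, 3]
        System.out.println(sortAscNullsLast(data));  // prints [1, 3, null]
    }
}
```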
- ascii(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the numeric value of the first character of the string column, and returns the
result as an int column.
- asin(Column) - Static method in class org.apache.spark.sql.functions
-
- asin(String) - Static method in class org.apache.spark.sql.functions
-
- asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
-
Read the elements of this stream through an iterator.
- asJavaPairRDD() - Method in class org.apache.spark.api.r.PairwiseRRDD
-
- asJavaRDD() - Method in class org.apache.spark.api.r.RRDD
-
- asJavaRDD() - Method in class org.apache.spark.api.r.StringRRDD
-
- asKeyValueIterator() - Method in class org.apache.spark.serializer.DeserializationStream
-
Read the elements of this stream through an iterator over key-value pairs.
- AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
-
- AskPermissionToCommitOutput(int, int, int, int) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- askRpcTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-
Returns the default Spark timeout to use for RPC ask operations.
- askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
-
- askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
-
- asMap() - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
- asML() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- asML() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- asML() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convert this matrix to the new mllib-local representation.
- asML() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- asML() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- asML() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Convert this vector to the new mllib-local representation.
- asNondeterministic() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
Updates UserDefinedFunction to nondeterministic.
- asNonNullable() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
Updates UserDefinedFunction to non-nullable.
- asNullable() - Method in class org.apache.spark.sql.types.ObjectType
-
- asRDDId() - Method in class org.apache.spark.storage.BlockId
-
- assertConf(JobContext, SparkConf) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
-
- assertNotSpilled(SparkContext, String, Function0<BoxedUnit>) - Static method in class org.apache.spark.TestUtils
-
Run some code involving jobs submitted to the given context and assert that the jobs
did not spill.
- assertSpilled(SparkContext, String, Function0<BoxedUnit>) - Static method in class org.apache.spark.TestUtils
-
Run some code involving jobs submitted to the given context and assert that the jobs spilled.
- assignClusters(Dataset<?>) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
Runs the PIC algorithm and returns a cluster assignment for each input vertex.
- Assignment(long, int) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-
- Assignment$() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
-
- assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- AssociationRules - Class in org.apache.spark.ml.fpm
-
- AssociationRules() - Constructor for class org.apache.spark.ml.fpm.AssociationRules
-
- associationRules() - Method in class org.apache.spark.ml.fpm.FPGrowthModel
-
Get association rules fitted using the minConfidence.
- AssociationRules - Class in org.apache.spark.mllib.fpm
-
Generates association rules from an RDD[FreqItemset[Item]].
- AssociationRules() - Constructor for class org.apache.spark.mllib.fpm.AssociationRules
-
Constructs a default instance with default parameters {minConfidence = 0.8}.
- AssociationRules.Rule<Item> - Class in org.apache.spark.mllib.fpm
-
An association rule between sets of items.
- ASYNC_TRACKING_ENABLED() - Static method in class org.apache.spark.status.config
-
- AsyncEventQueue - Class in org.apache.spark.scheduler
-
An asynchronous queue for events.
- AsyncEventQueue(String, SparkConf, LiveListenerBusMetrics, LiveListenerBus) - Constructor for class org.apache.spark.scheduler.AsyncEventQueue
-
- AsyncRDDActions<T> - Class in org.apache.spark.rdd
-
A set of asynchronous RDD actions available through an implicit conversion.
- AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
-
- atan(Column) - Static method in class org.apache.spark.sql.functions
-
- atan(String) - Static method in class org.apache.spark.sql.functions
-
- atan2(Column, Column) - Static method in class org.apache.spark.sql.functions
-
- atan2(Column, String) - Static method in class org.apache.spark.sql.functions
-
- atan2(String, Column) - Static method in class org.apache.spark.sql.functions
-
- atan2(String, String) - Static method in class org.apache.spark.sql.functions
-
- atan2(Column, double) - Static method in class org.apache.spark.sql.functions
-
- atan2(String, double) - Static method in class org.apache.spark.sql.functions
-
- atan2(double, Column) - Static method in class org.apache.spark.sql.functions
-
- atan2(double, String) - Static method in class org.apache.spark.sql.functions
-
- attempt() - Method in class org.apache.spark.status.api.v1.TaskData
-
- ATTEMPT() - Static method in class org.apache.spark.status.TaskIndexNames
-
- attemptId() - Method in class org.apache.spark.scheduler.StageInfo
-
- attemptId() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- attemptId() - Method in interface org.apache.spark.status.api.v1.BaseAppResource
-
- attemptId() - Method in class org.apache.spark.status.api.v1.StageData
-
- attemptNumber() - Method in class org.apache.spark.BarrierTaskContext
-
- attemptNumber() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- attemptNumber() - Method in class org.apache.spark.scheduler.StageInfo
-
- attemptNumber() - Method in class org.apache.spark.scheduler.TaskInfo
-
- attemptNumber() - Method in class org.apache.spark.TaskCommitDenied
-
- attemptNumber() - Method in class org.apache.spark.TaskContext
-
How many times this task has been attempted.
- attempts() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- AtTimestamp(Date) - Constructor for class org.apache.spark.streaming.kinesis.KinesisInitialPositions.AtTimestamp
-
- attr() - Method in class org.apache.spark.graphx.Edge
-
- attr() - Method in class org.apache.spark.graphx.EdgeContext
-
The attribute associated with the edge.
- attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- Attribute - Class in org.apache.spark.ml.attribute
-
Developer API
Abstract class for ML attributes.
- Attribute() - Constructor for class org.apache.spark.ml.attribute.Attribute
-
- attribute() - Method in class org.apache.spark.sql.sources.EqualNullSafe
-
- attribute() - Method in class org.apache.spark.sql.sources.EqualTo
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- attribute() - Method in class org.apache.spark.sql.sources.In
-
- attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
-
- attribute() - Method in class org.apache.spark.sql.sources.IsNull
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThan
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
-
- attribute() - Method in class org.apache.spark.sql.sources.StringContains
-
- attribute() - Method in class org.apache.spark.sql.sources.StringEndsWith
-
- attribute() - Method in class org.apache.spark.sql.sources.StringStartsWith
-
- AttributeFactory - Interface in org.apache.spark.ml.attribute
-
Trait for ML attribute factories.
- AttributeGroup - Class in org.apache.spark.ml.attribute
-
Developer API
Attributes that describe a vector ML column.
- AttributeGroup(String) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
-
Creates an attribute group without attribute info.
- AttributeGroup(String, int) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
-
Creates an attribute group knowing only the number of attributes.
- AttributeGroup(String, Attribute[]) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
-
Creates an attribute group with attributes.
- AttributeKeys - Class in org.apache.spark.ml.attribute
-
Keys used to store attributes.
- AttributeKeys() - Constructor for class org.apache.spark.ml.attribute.AttributeKeys
-
- attributes() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Optional array of attributes.
- ATTRIBUTES() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
-
- AttributeType - Class in org.apache.spark.ml.attribute
-
Developer API
An enum-like type for attribute types: AttributeType$.Numeric, AttributeType$.Nominal, and AttributeType$.Binary.
- AttributeType(String) - Constructor for class org.apache.spark.ml.attribute.AttributeType
-
- attrType() - Method in class org.apache.spark.ml.attribute.Attribute
-
Attribute type.
- attrType() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
-
- attrType() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
- attrType() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- attrType() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- available() - Method in class org.apache.spark.io.NioBufferedFileInputStream
-
- available() - Method in class org.apache.spark.io.ReadAheadInputStream
-
- available() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- avg(MapFunction<T, Double>) - Static method in class org.apache.spark.sql.expressions.javalang.typed
-
Average aggregate function.
- avg(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed
-
Average aggregate function.
- avg(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- avg(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- avg(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Computes the mean value of each numeric column for each group.
- avg(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Computes the mean value of each numeric column for each group.
- avg() - Method in class org.apache.spark.util.DoubleAccumulator
-
Returns the average of elements added to the accumulator.
- avg() - Method in class org.apache.spark.util.LongAccumulator
-
Returns the average of elements added to the accumulator.
- avgEventRate() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
-
- avgInputRate() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
-
- avgMetrics() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- avgProcessingTime() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
-
- avgSchedulingDelay() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
-
- avgTotalDelay() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
-
- awaitAnyTermination() - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
-
Wait until any of the queries on the associated SQLContext has terminated since the
creation of the context, or since resetTerminated() was called.
- awaitAnyTermination(long) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
-
Wait until any of the queries on the associated SQLContext has terminated since the
creation of the context, or since resetTerminated() was called.
- awaitReady(Awaitable<T>, Duration) - Static method in class org.apache.spark.util.ThreadUtils
-
Preferred alternative to Await.ready().
- awaitResult(Awaitable<T>, Duration) - Static method in class org.apache.spark.util.ThreadUtils
-
Preferred alternative to Await.result().
- awaitTermination() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Waits for the termination of this query, either by query.stop() or by an exception.
- awaitTermination(long) - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Waits for the termination of this query, either by query.stop() or by an exception.
- awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- axpy(double, Vector, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS
-
y += a * x
- axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y += a * x
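The update `y += a * x` documented for axpy can be sketched over plain `double[]` arrays. This illustrates the arithmetic only and does not call the Spark BLAS class; `AxpySketch` and its array-based signature are assumptions of the example, whereas the indexed methods take Vector arguments.

```java
import java.util.Arrays;

public class AxpySketch {
    /** In-place update y[i] += a * x[i] for every index i. */
    static void axpy(double a, double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        for (int i = 0; i < x.length; i++) {
            y[i] += a * x[i];
        }
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {1.0, 1.0, 1.0};
        axpy(2.0, x, y);
        System.out.println(Arrays.toString(y)); // prints [3.0, 5.0, 7.0]
    }
}
```

Note that y is modified in place and x is left untouched.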
- cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Persist this RDD with the default storage level (MEMORY_ONLY).
- cache() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Persist this RDD with the default storage level (MEMORY_ONLY).
- cache() - Method in class org.apache.spark.api.java.JavaRDD
-
Persist this RDD with the default storage level (MEMORY_ONLY).
- cache() - Method in class org.apache.spark.graphx.Graph
-
Caches the vertices and edges associated with this graph at the previously-specified target
storage levels, which default to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
Persists the edge partitions using targetStorageLevel, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
Persists the vertex partitions at targetStorageLevel, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Caches the underlying RDD.
- cache() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (MEMORY_ONLY).
- cache() - Method in class org.apache.spark.sql.Dataset
-
Persist this Dataset with the default storage level (MEMORY_AND_DISK).
- cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cacheNodeIds() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
If false, the algorithm will pass trees to executors to match instances with nodes.
- cacheSize() - Method in interface org.apache.spark.SparkExecutorInfo
-
- cacheSize() - Method in class org.apache.spark.SparkExecutorInfoImpl
-
- cacheTable(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Caches the specified table in-memory.
- cacheTable(String, StorageLevel) - Method in class org.apache.spark.sql.catalog.Catalog
-
Caches the specified table with the given storage level.
- cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Caches the specified table in-memory.
- calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.AFTCostFun
-
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
Developer API
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
Developer API
variance calculation
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
Developer API
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
Developer API
variance calculation
- calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
Developer API
information calculation for multiclass classification
- calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
Developer API
information calculation for regression
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
Developer API
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
Developer API
variance calculation
- calculateNumberOfPartitions(long, int, int) - Method in class org.apache.spark.ml.feature.Word2VecModel.Word2VecModelWriter$
-
Calculate the number of partitions to use in saving the model.
- CalendarIntervalType - Class in org.apache.spark.sql.types
-
The data type representing calendar time intervals.
- CalendarIntervalType() - Constructor for class org.apache.spark.sql.types.CalendarIntervalType
-
- CalendarIntervalType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the CalendarIntervalType object.
- call(K, Iterator<V1>, Iterator<V2>) - Method in interface org.apache.spark.api.java.function.CoGroupFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.FilterFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
-
- call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.FlatMapGroupsFunction
-
- call(K, Iterator<V>, GroupState<S>) - Method in interface org.apache.spark.api.java.function.FlatMapGroupsWithStateFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.ForeachFunction
-
- call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.ForeachPartitionFunction
-
- call(T1) - Method in interface org.apache.spark.api.java.function.Function
-
- call() - Method in interface org.apache.spark.api.java.function.Function0
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
-
- call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
-
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.api.java.function.Function4
-
- call(T) - Method in interface org.apache.spark.api.java.function.MapFunction
-
- call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.MapGroupsFunction
-
- call(K, Iterator<V>, GroupState<S>) - Method in interface org.apache.spark.api.java.function.MapGroupsWithStateFunction
-
- call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.MapPartitionsFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
-
- call(T, T) - Method in interface org.apache.spark.api.java.function.ReduceFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.VoidFunction2
-
- call() - Method in interface org.apache.spark.sql.api.java.UDF0
-
- call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
-
- call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
-
- call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
-
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
-
- call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
-
- call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
-
- call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
-
- call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
-
- callSite() - Method in class org.apache.spark.storage.RDDInfo
-
- callUDF(String, Column...) - Static method in class org.apache.spark.sql.functions
-
Call a user-defined function.
- callUDF(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Call a user-defined function.
- cancel() - Method in class org.apache.spark.ComplexFutureAction
-
- cancel() - Method in interface org.apache.spark.FutureAction
-
Cancels the execution of this action.
- cancel() - Method in class org.apache.spark.SimpleFutureAction
-
- cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelAllJobs() - Method in class org.apache.spark.SparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelJob(int, String) - Method in class org.apache.spark.SparkContext
-
Cancel a given job if it's scheduled or running.
- cancelJob(int) - Method in class org.apache.spark.SparkContext
-
Cancel a given job if it's scheduled or running.
- cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel active jobs for the specified group.
- cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
-
Cancel active jobs for the specified group.
- cancelStage(int, String) - Method in class org.apache.spark.SparkContext
-
Cancel a given stage and all jobs associated with it.
- cancelStage(int) - Method in class org.apache.spark.SparkContext
-
Cancel a given stage and all jobs associated with it.
- cancelTasks(int, boolean) - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- canCreate(String) - Method in interface org.apache.spark.scheduler.ExternalClusterManager
-
Check if this cluster manager instance can create scheduler components
for a certain master URL.
- canDoMerge() - Method in class org.apache.spark.sql.hive.HiveUDAFBuffer
-
- canEqual(Object) - Static method in class org.apache.spark.ExpireDeadHosts
-
- canEqual(Object) - Static method in class org.apache.spark.ml.feature.Dot
-
- canEqual(Object) - Static method in class org.apache.spark.Resubmitted
-
- canEqual(Object) - Static method in class org.apache.spark.rpc.netty.OnStart
-
- canEqual(Object) - Static method in class org.apache.spark.rpc.netty.OnStop
-
- canEqual(Object) - Static method in class org.apache.spark.scheduler.AllJobsCancelled
-
- canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- canEqual(Object) - Static method in class org.apache.spark.scheduler.JobSucceeded
-
- canEqual(Object) - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
-
- canEqual(Object) - Static method in class org.apache.spark.scheduler.StopCoordinator
-
- canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.BinaryType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.BooleanType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.ByteType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.DateType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.DoubleType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.FloatType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.IntegerType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.LongType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.NullType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.ShortType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.StringType
-
- canEqual(Object) - Static method in class org.apache.spark.sql.types.TimestampType
-
- canEqual(Object) - Static method in class org.apache.spark.StopMapOutputTracker
-
- canEqual(Object) - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
-
- canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
-
- canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
-
- canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
-
- canEqual(Object) - Static method in class org.apache.spark.Success
-
- canEqual(Object) - Static method in class org.apache.spark.TaskResultLost
-
- canEqual(Object) - Static method in class org.apache.spark.TaskSchedulerIsSet
-
- canEqual(Object) - Static method in class org.apache.spark.UnknownReason
-
- canEqual(Object) - Method in class org.apache.spark.util.MutablePair
-
- canHandle(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- canHandle(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Check if this dialect instance can handle a certain JDBC URL.
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
- canWrite(DataType, DataType, Function2<String, String, Object>, String, Function1<String, BoxedUnit>) - Static method in class org.apache.spark.sql.types.DataType
-
Returns true if the write data type can be read using the read data type.
- cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in this and b is in other.
- cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in this and b is in other.
- caseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
Whether to do a case-sensitive comparison over the stop words.
- cast(DataType) - Method in class org.apache.spark.sql.Column
-
Casts the column to a different data type.
- cast(String) - Method in class org.apache.spark.sql.Column
-
Casts the column to a different data type, using the canonical string representation
of the type.
- Catalog - Class in org.apache.spark.sql.catalog
-
Catalog interface for Spark.
- Catalog() - Constructor for class org.apache.spark.sql.catalog.Catalog
-
- catalog() - Method in class org.apache.spark.sql.SparkSession
-
Interface through which the user may create, drop, alter or query underlying
databases, tables, functions etc.
- catalogString() - Method in class org.apache.spark.sql.types.ArrayType
-
- catalogString() - Static method in class org.apache.spark.sql.types.BinaryType
-
- catalogString() - Static method in class org.apache.spark.sql.types.BooleanType
-
- catalogString() - Static method in class org.apache.spark.sql.types.ByteType
-
- catalogString() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- catalogString() - Method in class org.apache.spark.sql.types.DataType
-
String representation for the type saved in external catalogs.
- catalogString() - Static method in class org.apache.spark.sql.types.DateType
-
- catalogString() - Static method in class org.apache.spark.sql.types.DoubleType
-
- catalogString() - Static method in class org.apache.spark.sql.types.FloatType
-
- catalogString() - Static method in class org.apache.spark.sql.types.IntegerType
-
- catalogString() - Static method in class org.apache.spark.sql.types.LongType
-
- catalogString() - Method in class org.apache.spark.sql.types.MapType
-
- catalogString() - Static method in class org.apache.spark.sql.types.NullType
-
- catalogString() - Static method in class org.apache.spark.sql.types.ShortType
-
- catalogString() - Static method in class org.apache.spark.sql.types.StringType
-
- catalogString() - Method in class org.apache.spark.sql.types.StructType
-
- catalogString() - Static method in class org.apache.spark.sql.types.TimestampType
-
- CatalystScan - Interface in org.apache.spark.sql.sources
-
Experimental
An interface for experimenting with a more direct connection to the query planner.
- Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- categoricalCols() - Method in class org.apache.spark.ml.feature.FeatureHasher
-
Numeric columns to treat as categorical features.
- categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- CategoricalSplit - Class in org.apache.spark.ml.tree
-
Split which tests a categorical feature.
- categories() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- categories() - Method in class org.apache.spark.mllib.tree.model.Split
-
- categoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- categorySizes() - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- cause() - Method in exception org.apache.spark.sql.AnalysisException
-
- cause() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
-
- CausedBy - Class in org.apache.spark.util
-
Extractor Object for pulling out the root cause of an error.
- CausedBy() - Constructor for class org.apache.spark.util.CausedBy
-
- cbrt(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the cube-root of the given value.
- cbrt(String) - Static method in class org.apache.spark.sql.functions
-
Computes the cube-root of the given column.
- ceil(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the ceiling of the given value.
- ceil(String) - Static method in class org.apache.spark.sql.functions
-
Computes the ceiling of the given column.
- ceil() - Method in class org.apache.spark.sql.types.Decimal
-
- censorCol() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams
-
Param for censor column name.
- chainl1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Function2<T, T, T>>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- chainl1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<U>>, Function0<Parsers.Parser<Function2<T, U, T>>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- chainr1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Function2<T, U, U>>>, Function2<T, U, U>, U) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- changePrecision(int, int) - Method in class org.apache.spark.sql.types.Decimal
-
Update precision and scale while keeping our value the same, and return true if successful.
- channelRead0(ChannelHandlerContext, byte[]) - Method in class org.apache.spark.api.r.RBackendAuthHandler
-
- CharType - Class in org.apache.spark.sql.types
-
Hive char type.
- CharType(int) - Constructor for class org.apache.spark.sql.types.CharType
-
- checkAndGetK8sMasterUrl(String) - Static method in class org.apache.spark.util.Utils
-
Check the validity of the given Kubernetes master URL and return the resolved URL.
- checkColumnNameDuplication(Seq<String>, String, Function2<String, String, Object>) - Static method in class org.apache.spark.sql.util.SchemaUtils
-
Checks if input column names have duplicate identifiers.
- checkColumnNameDuplication(Seq<String>, String, boolean) - Static method in class org.apache.spark.sql.util.SchemaUtils
-
Checks if input column names have duplicate identifiers.
- checkColumnType(StructType, String, DataType, String) - Static method in class org.apache.spark.ml.util.SchemaUtils
-
Check whether the given schema contains a column of the required data type.
- checkColumnTypes(StructType, String, Seq<DataType>, String) - Static method in class org.apache.spark.ml.util.SchemaUtils
-
Check whether the given schema contains a column of one of the required data types.
- checkDataColumns(RFormula, Dataset<?>) - Static method in class org.apache.spark.ml.r.RWrapperUtils
-
DataFrame column check.
- checkedCast() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
-
Attempts to safely cast a user/item id to an Int.
- checkFileExists(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
Check if the file exists at the given path.
- checkHost(String) - Static method in class org.apache.spark.util.Utils
-
- checkHostPort(String) - Static method in class org.apache.spark.util.Utils
-
- checkNumericType(StructType, String, String) - Static method in class org.apache.spark.ml.util.SchemaUtils
-
Check whether the given schema contains a column of the numeric data type.
- checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Mark this RDD for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.Graph
-
Mark this Graph for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.RDD
-
Mark this RDD for checkpointing.
- checkpoint() - Method in class org.apache.spark.sql.Dataset
-
Eagerly checkpoint a Dataset and return the new Dataset.
- checkpoint(boolean) - Method in class org.apache.spark.sql.Dataset
-
Returns a checkpointed version of this Dataset.
- checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Enable periodic checkpointing of RDDs of this DStream.
- checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets the context to periodically checkpoint the DStream operations for master
fault-tolerance.
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Enable periodic checkpointing of RDDs of this DStream.
- checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Set the context to periodically checkpoint the DStream operations for driver
fault-tolerance.
- checkpointCleaned(long) - Method in interface org.apache.spark.CleanerListener
-
- Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
-
- CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
-
- checkpointInterval() - Method in interface org.apache.spark.ml.param.shared.HasCheckpointInterval
-
Param for setting the checkpoint interval (>= 1) or disabling checkpointing (-1).
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- CheckpointReader - Class in org.apache.spark.streaming
-
- CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
-
- CheckpointState - Class in org.apache.spark.rdd
-
Enumeration to manage state transitions of an RDD through checkpointing
- CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
-
- checkSchemaColumnNameDuplication(StructType, String, boolean) - Static method in class org.apache.spark.sql.util.SchemaUtils
-
Checks if an input schema has duplicate column names.
- checkSingleVsMultiColumnParams(Params, Seq<Param<?>>, Seq<Param<?>>) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Utility for Param validity checks for Transformers which have both single- and multi-column
support.
- checkSpeculatableTasks(int) - Method in interface org.apache.spark.scheduler.Schedulable
-
- checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- checkThresholdConsistency() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
If threshold and thresholds are both set, ensures they are consistent.
- child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
-
- child() - Method in class org.apache.spark.sql.sources.Not
-
- CHILD_CONNECTION_TIMEOUT - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Maximum time (in ms) to wait for a child process to connect back to the launcher server
when using start().
- CHILD_PROCESS_LOGGER_NAME - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Logger name to use when launching a child process.
- ChildFirstURLClassLoader - Class in org.apache.spark.util
-
A mutable class loader that gives preference to its own URLs over the parent class loader
when loading classes and resources.
- ChildFirstURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.ChildFirstURLClassLoader
-
- chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
-
- ChiSqSelector - Class in org.apache.spark.ml.feature
-
Chi-Squared feature selection, which selects categorical features to use for predicting a
categorical label.
- ChiSqSelector(String) - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
-
- ChiSqSelector() - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
-
- ChiSqSelector - Class in org.apache.spark.mllib.feature
-
Creates a ChiSquared feature selector.
- ChiSqSelector() - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
-
- ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
-
This is the same as calling this() and setNumTopFeatures(numTopFeatures).
- ChiSqSelectorModel - Class in org.apache.spark.ml.feature
-
- ChiSqSelectorModel - Class in org.apache.spark.mllib.feature
-
Chi Squared selector model.
- ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
- ChiSqSelectorModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.feature
-
- ChiSqSelectorModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.feature
-
Model data for import/export
- ChiSqSelectorModel.SaveLoadV1_0$.Data$ - Class in org.apache.spark.mllib.feature
-
- ChiSqSelectorParams - Interface in org.apache.spark.ml.feature
-
- chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's chi-squared goodness of fit test of the observed data against the
expected distribution.
- chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform
distribution, with each category having an expected frequency of 1 / observed.size.
- chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's independence test on the input contingency matrix, which cannot contain
negative entries or columns or rows that sum up to 0.
- chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's independence test for every feature against the label across the input RDD.
- chiSqTest(JavaRDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Java-friendly version of chiSqTest()
- ChiSqTest - Class in org.apache.spark.mllib.stat.test
-
Conduct the chi-squared test for the input RDDs using the specified method.
- ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
-
- ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test
-
param: name String name for the method.
- ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
-
Object containing the test results for the chi-squared hypothesis test.
- chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
Conduct Pearson's independence test for each feature against the label across the input RDD.
- chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- ChiSquareTest - Class in org.apache.spark.ml.stat
-
Experimental
- ChiSquareTest() - Constructor for class org.apache.spark.ml.stat.ChiSquareTest
-
- chmod700(File) - Static method in class org.apache.spark.util.Utils
-
JDK equivalent of chmod 700 file.
- CholeskyDecomposition - Class in org.apache.spark.mllib.linalg
-
Compute Cholesky decomposition.
- CholeskyDecomposition() - Constructor for class org.apache.spark.mllib.linalg.CholeskyDecomposition
-
- cipherStream() - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler
-
The encrypted stream that may get into an unhealthy state.
- classForName(String) - Static method in class org.apache.spark.util.Utils
-
Preferred alternative to Class.forName(className)
- Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- ClassificationLoss - Interface in org.apache.spark.mllib.tree.loss
-
- ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
Developer API
- ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
-
- ClassificationModel - Interface in org.apache.spark.mllib.classification
-
Represents a classification model that predicts to which of a set of categories an example
belongs.
- Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
Developer API
- Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
-
- classifier() - Method in interface org.apache.spark.ml.classification.OneVsRestParams
-
Param for the base binary classifier that we reduce multiclass classification into.
- ClassifierParams - Interface in org.apache.spark.ml.classification
-
(private[spark]) Params for classification.
- ClassifierTypeTrait - Interface in org.apache.spark.ml.classification
-
- classIsLoadable(String) - Static method in class org.apache.spark.util.Utils
-
Determines whether the provided class is loadable in the current thread.
- className() - Method in class org.apache.spark.ExceptionFailure
-
- className() - Static method in class org.apache.spark.ml.linalg.JsonMatrixConverter
-
Unique class name for identifying JSON object encoded by this class.
- className() - Method in class org.apache.spark.sql.catalog.Function
-
- classpathEntries() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
-
- classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaRDD
-
- classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- classTag() - Method in class org.apache.spark.sql.Dataset
-
- classTag() - Method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
-
- classTag() - Method in interface org.apache.spark.storage.memory.MemoryEntry
-
- classTag() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- clean(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLog
-
Clean all the records that are older than the threshold time.
- clean(Object, boolean, boolean) - Static method in class org.apache.spark.util.ClosureCleaner
-
Clean the given closure in place.
- CleanAccum - Class in org.apache.spark
-
- CleanAccum(long) - Constructor for class org.apache.spark.CleanAccum
-
- CleanBroadcast - Class in org.apache.spark
-
- CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
-
- CleanCheckpoint - Class in org.apache.spark
-
- CleanCheckpoint(int) - Constructor for class org.apache.spark.CleanCheckpoint
-
- CleanerListener - Interface in org.apache.spark
-
Listener class used for testing when any item has been cleaned by the Cleaner class.
- cleaning() - Method in class org.apache.spark.status.LiveStage
-
- CleanRDD - Class in org.apache.spark
-
- CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
-
- CleanShuffle - Class in org.apache.spark
-
- CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
-
- cleanupOldBlocks(long) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
-
Clean up blocks older than the given threshold time.
- CleanupTask - Interface in org.apache.spark
-
Classes that represent cleaning tasks.
- CleanupTaskWeakReference - Class in org.apache.spark
-
A WeakReference associated with a CleanupTask.
- CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
-
- clear(Param<?>) - Method in interface org.apache.spark.ml.param.Params
-
Clears the user-supplied value for the input param.
- clear() - Method in class org.apache.spark.sql.util.ExecutionListenerManager
-
- clear() - Static method in class org.apache.spark.util.AccumulatorContext
-
- clearActive() - Static method in class org.apache.spark.sql.SQLContext
-
- clearActiveSession() - Static method in class org.apache.spark.sql.SparkSession
-
Clears the active SparkSession for current thread.
- clearCache() - Method in class org.apache.spark.sql.catalog.Catalog
-
Removes all cached tables from the in-memory cache.
- clearCache() - Method in class org.apache.spark.sql.SQLContext
-
Removes all cached tables from the in-memory cache.
- clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.clearCallSite.
- clearCallSite() - Method in class org.apache.spark.SparkContext
-
Clear the thread-local property for overriding the call sites
of actions and RDDs.
- clearDefaultSession() - Static method in class org.apache.spark.sql.SparkSession
-
Clears the default SparkSession that is returned by the builder.
- clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the current thread's job group ID and its description.
- clearJobGroup() - Method in class org.apache.spark.SparkContext
-
Clear the current thread's job group ID and its description.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
Clears the threshold so that predict will output raw prediction scores.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
Clears the threshold so that predict will output raw prediction scores.
- Clock - Interface in org.apache.spark.util
-
An interface to represent clocks, so that they can be mocked out in unit tests.
- CLogLog$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
-
- clone() - Method in class org.apache.spark.SparkConf
-
Copy this object.
- clone() - Method in class org.apache.spark.sql.ExperimentalMethods
-
- clone() - Method in class org.apache.spark.sql.types.Decimal
-
- clone() - Method in class org.apache.spark.sql.util.ExecutionListenerManager
-
Get an identical copy of this listener manager.
- clone() - Method in class org.apache.spark.storage.StorageLevel
-
- clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- clone() - Method in class org.apache.spark.util.random.BernoulliSampler
-
- clone() - Method in class org.apache.spark.util.random.PoissonSampler
-
- clone() - Method in interface org.apache.spark.util.random.RandomSampler
-
Return a copy of the RandomSampler object.
- clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
-
Clone an object using a Spark serializer.
- cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
Return a sampler that is the complement of the range specified by the current sampler.
- cloneProperties(Properties) - Static method in class org.apache.spark.util.Utils
-
Create a new properties object with the same values as `props`
- close() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- close() - Method in class org.apache.spark.io.NioBufferedFileInputStream
-
- close() - Method in class org.apache.spark.io.ReadAheadInputStream
-
- close() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
-
- close() - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler
-
- close() - Method in class org.apache.spark.serializer.DeserializationStream
-
- close() - Method in class org.apache.spark.serializer.SerializationStream
-
- close(Throwable) - Method in class org.apache.spark.sql.ForeachWriter
-
Called when processing of one partition of new data stops on the executor side.
- close() - Method in class org.apache.spark.sql.hive.execution.HiveOutputWriter
-
- close() - Method in class org.apache.spark.sql.SparkSession
-
Synonym for stop().
- close() - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- close() - Method in class org.apache.spark.sql.vectorized.ColumnarBatch
-
Called to close all the columns in this batch.
- close() - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Cleans up memory for this column vector.
- close() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- close() - Method in class org.apache.spark.storage.CountingWritableChannel
-
- close() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
-
- close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLog
-
Close this log and release any resources.
- closed() - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler
-
- closeWriter(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
-
- ClosureCleaner - Class in org.apache.spark.util
-
A cleaner that renders closures serializable if this can be done safely.
- ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
-
- closureSerializer() - Method in class org.apache.spark.SparkEnv
-
- cls() - Method in class org.apache.spark.sql.types.ObjectType
-
- cls() - Method in class org.apache.spark.util.MethodIdentifier
-
- clsTag() - Method in interface org.apache.spark.sql.Encoder
-
A ClassTag that can be used to construct an Array to contain a collection of T.
- cluster() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
-
Cluster centers of the transformed data.
- cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-
- clusterCenter() - Method in class org.apache.spark.ml.clustering.ClusterData
-
- clusterCenters() - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
- clusterCenters() - Method in class org.apache.spark.ml.clustering.KMeansModel
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
Leaf cluster centers.
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- ClusterData - Class in org.apache.spark.ml.clustering
-
Helper class for storing model data
- ClusterData(int, Vector) - Constructor for class org.apache.spark.ml.clustering.ClusterData
-
- clusteredColumns - Variable in class org.apache.spark.sql.sources.v2.reader.partitioning.ClusteredDistribution
-
The names of the clustered columns.
- ClusteredDistribution - Class in org.apache.spark.sql.sources.v2.reader.partitioning
-
- ClusteredDistribution(String[]) - Constructor for class org.apache.spark.sql.sources.v2.reader.partitioning.ClusteredDistribution
-
- clusterIdx() - Method in class org.apache.spark.ml.clustering.ClusterData
-
- ClusteringEvaluator - Class in org.apache.spark.ml.evaluation
-
Experimental
- ClusteringEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- ClusteringEvaluator() - Constructor for class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- ClusteringSummary - Class in org.apache.spark.ml.clustering
-
Experimental
Summary of clustering algorithms.
- clusterSizes() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
-
Size of (number of data points in) each cluster.
- ClusterStats(Vector, double, long) - Constructor for class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats
-
- ClusterStats$() - Constructor for class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats$
-
- clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- cn() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, RDD<?>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
Runs the packing algorithm and returns an array of PartitionGroups that if possible are
load balanced and grouped by locality
- coalesce(int, RDD<?>) - Method in interface org.apache.spark.rdd.PartitionCoalescer
-
Coalesce the partitions of the given RDD.
- coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset that has exactly numPartitions partitions, when fewer partitions
are requested.
- coalesce(Column...) - Static method in class org.apache.spark.sql.functions
-
Returns the first column that is not null, or null if all inputs are null.
- coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Returns the first column that is not null, or null if all inputs are null.
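The "first column that is not null" rule can be pictured per row. The following is a minimal plain-Java sketch of that rule only; it does not use Spark, and the class name CoalesceSketch and method firstNonNull are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class CoalesceSketch {
    /** Returns the first non-null element of one row's values, or null if all are null. */
    static <T> T firstNonNull(List<T> rowValues) {
        for (T value : rowValues) {
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Each list stands in for the values of the input columns on a single row.
        String first = firstNonNull(Arrays.asList(null, "b", "c"));
        String none = firstNonNull(Arrays.asList((String) null, null));
        System.out.println(first);
        System.out.println(none);
    }
}
```

In this sketch the first call yields "b" and the second yields null, matching the "or null if all inputs are null" clause.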
- CoarseGrainedClusterMessage - Interface in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
-
- CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.GetExecutorLossReason - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.GetExecutorLossReason$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutorsOnHost - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutorsOnHost$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterClusterManager - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorResponse - Interface in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveWorker - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveWorker$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RetrieveLastAllocatedExecutorId$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RetrieveSparkAppConfig$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.SetupDriver - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.SetupDriver$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.Shutdown$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.SparkAppConfig - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.SparkAppConfig$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.UpdateDelegationTokens - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.UpdateDelegationTokens$ - Class in org.apache.spark.scheduler.cluster
-
- code() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- CodegenMetrics - Class in org.apache.spark.metrics.source
-
Experimental
Metrics for code generation.
- CodegenMetrics() - Constructor for class org.apache.spark.metrics.source.CodegenMetrics
-
- codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- coefficientMatrix() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- coefficients() - Method in class org.apache.spark.ml.classification.LinearSVCModel
-
- coefficients() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
A vector of model coefficients for "binomial" logistic regression.
- coefficients() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- coefficients() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
- coefficients() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- coefficientStandardErrors() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
-
Standard error of estimated coefficients and intercept.
- coefficientStandardErrors() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Standard error of estimated coefficients and intercept.
- cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
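The grouping described for the two-input cogroup can be sketched with ordinary Java collections. This is an illustration of the semantics only, not Spark's implementation: the class CogroupSketch, the holder Grouped, and the helper pair are hypothetical, and in-memory lists of key-value pairs stand in for the two RDDs.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CogroupSketch {
    /** Values collected for one key: those from "this" and those from "other". */
    static final class Grouped<V, W> {
        final List<V> fromThis = new ArrayList<>();
        final List<W> fromOther = new ArrayList<>();
    }

    /** Hypothetical helper that builds one key-value pair. */
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /** Every key present on either side gets one entry; a missing side stays an empty list. */
    static <K, V, W> Map<K, Grouped<V, W>> cogroup(
            List<? extends Map.Entry<K, V>> thisPairs,
            List<? extends Map.Entry<K, W>> otherPairs) {
        Map<K, Grouped<V, W>> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : thisPairs) {
            result.computeIfAbsent(e.getKey(), k -> new Grouped<>()).fromThis.add(e.getValue());
        }
        for (Map.Entry<K, W> e : otherPairs) {
            result.computeIfAbsent(e.getKey(), k -> new Grouped<>()).fromOther.add(e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
            Arrays.asList(pair("a", 1), pair("a", 2), pair("b", 3));
        List<Map.Entry<String, String>> right =
            Arrays.asList(pair("a", "x"), pair("c", "y"));
        Map<String, Grouped<Integer, String>> out = cogroup(left, right);
        for (Map.Entry<String, Grouped<Integer, String>> e : out.entrySet()) {
            System.out.println(e.getKey() + " -> ("
                + e.getValue().fromThis + ", " + e.getValue().fromOther + ")");
        }
    }
}
```

Key "a" appears on both sides, so both of its lists are populated; "b" and "c" each appear on one side only and get an empty list for the other side.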
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(KeyValueGroupedDataset<K, U>, Function3<K, Iterator<V>, Iterator<U>, TraversableOnce<R>>, Encoder<R>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
(Scala-specific)
Applies the given function to each cogrouped data.
- cogroup(KeyValueGroupedDataset<K, U>, CoGroupFunction<K, V, U, R>, Encoder<R>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
(Java-specific)
Applies the given function to each cogrouped data.
- cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- CoGroupedRDD<K> - Class in org.apache.spark.rdd
-
Developer API
An RDD that cogroups its parents.
- CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner, ClassTag<K>) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
-
- CoGroupFunction<K,V1,V2,R> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more output records from each grouping key and its values from 2
Datasets.
- col(String) - Method in class org.apache.spark.sql.Dataset
-
Selects column based on the column name and returns it as a Column.
- col(String) - Static method in class org.apache.spark.sql.functions
-
Returns a Column based on the given column name.
- coldStartStrategy() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
-
Param for strategy for dealing with unknown or new users/items at prediction time.
- colIter() - Method in class org.apache.spark.ml.linalg.DenseMatrix
-
- colIter() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Returns an iterator of column vectors.
- colIter() - Method in class org.apache.spark.ml.linalg.SparseMatrix
-
- colIter() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- colIter() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Returns an iterator of column vectors.
- colIter() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in this RDD.
- collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- collect() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD that contains all matching values by applying f.
- collect() - Method in class org.apache.spark.sql.Dataset
-
Returns an array that contains all rows in this Dataset.
- collect_list(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns a list of objects with duplicates.
- collect_list(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns a list of objects with duplicates.
- collect_set(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns a set of objects with duplicate elements eliminated.
- collect_set(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns a set of objects with duplicate elements eliminated.
- collectAsList() - Method in class org.apache.spark.sql.Dataset
-
Returns a Java list that contains all rows in this Dataset.
- collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of collect, which returns a future for
retrieving an array containing all of the elements in this RDD.
- collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving all elements of this RDD.
- collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Returns an RDD that contains for each vertex v its local edges,
i.e., the edges that are incident on v, in the user-specified direction.
- collectionAccumulator() - Method in class org.apache.spark.SparkContext
-
Create and register a CollectionAccumulator, which starts with an empty list and accumulates
inputs by adding them into the list.
- collectionAccumulator(String) - Method in class org.apache.spark.SparkContext
-
Create and register a CollectionAccumulator, which starts with an empty list and accumulates
inputs by adding them into the list.
- CollectionAccumulator<T> - Class in org.apache.spark.util
-
- CollectionAccumulator() - Constructor for class org.apache.spark.util.CollectionAccumulator
-
- CollectionsUtils - Class in org.apache.spark.util
-
- CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
-
- collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex ids for each vertex.
- collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex attributes for each vertex.
- collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in a specific partition of this RDD.
- collectSubModels() - Method in interface org.apache.spark.ml.param.shared.HasCollectSubModels
-
Param for whether to collect a list of sub-models trained during tuning.
- colPtrs() - Method in class org.apache.spark.ml.linalg.SparseMatrix
-
- colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- colRegex(String) - Method in class org.apache.spark.sql.Dataset
-
Selects column based on the column name specified as a regex and returns it as Column.
- colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Computes column-wise summary statistics for the input RDD[Vector].
- Column - Class in org.apache.spark.sql.catalog
-
A column in Spark, as returned by the listColumns method in Catalog.
- Column(String, String, String, boolean, boolean, boolean) - Constructor for class org.apache.spark.sql.catalog.Column
-
- Column - Class in org.apache.spark.sql
-
A column that will be computed based on the data in a DataFrame.
- Column(Expression) - Constructor for class org.apache.spark.sql.Column
-
- Column(String) - Constructor for class org.apache.spark.sql.Column
-
- column(String) - Static method in class org.apache.spark.sql.functions
-
Returns a Column based on the given column name.
- column(int) - Method in class org.apache.spark.sql.vectorized.ColumnarBatch
-
Returns the column at `ordinal`.
- ColumnarArray - Class in org.apache.spark.sql.vectorized
-
- ColumnarArray(ColumnVector, int, int) - Constructor for class org.apache.spark.sql.vectorized.ColumnarArray
-
- ColumnarBatch - Class in org.apache.spark.sql.vectorized
-
This class wraps multiple ColumnVectors as a row-wise table.
- ColumnarBatch(ColumnVector[]) - Constructor for class org.apache.spark.sql.vectorized.ColumnarBatch
-
- ColumnarMap - Class in org.apache.spark.sql.vectorized
-
- ColumnarMap(ColumnVector, ColumnVector, int, int) - Constructor for class org.apache.spark.sql.vectorized.ColumnarMap
-
- ColumnarRow - Class in org.apache.spark.sql.vectorized
-
- ColumnarRow(ColumnVector, int) - Constructor for class org.apache.spark.sql.vectorized.ColumnarRow
-
- ColumnName - Class in org.apache.spark.sql
-
A convenient class used for constructing schema.
- ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
-
- ColumnPruner - Class in org.apache.spark.ml.feature
-
Utility transformer for removing temporary columns from a DataFrame.
- ColumnPruner(String, Set<String>) - Constructor for class org.apache.spark.ml.feature.ColumnPruner
-
- ColumnPruner(Set<String>) - Constructor for class org.apache.spark.ml.feature.ColumnPruner
-
- columns() - Method in class org.apache.spark.sql.Dataset
-
Returns all column names as an array.
- columnSchema() - Static method in class org.apache.spark.ml.image.ImageSchema
-
Schema for the image column: Row(String, Int, Int, Int, Int, Array[Byte])
- columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Compute all cosine similarities between columns of this matrix using the brute-force
approach of computing normalized dot products.
- columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute all cosine similarities between columns of this matrix using the brute-force
approach of computing normalized dot products.
- columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute similarities between columns of this matrix using a sampling approach.
- columnsToPrune() - Method in class org.apache.spark.ml.feature.ColumnPruner
-
- columnToOldVector(Dataset<?>, String) - Static method in class org.apache.spark.ml.util.DatasetUtils
-
- columnToVector(Dataset<?>, String) - Static method in class org.apache.spark.ml.util.DatasetUtils
-
Cast a column in a Dataset to Vector type.
- ColumnVector - Class in org.apache.spark.sql.vectorized
-
An interface representing in-memory columnar data in Spark.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
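The "custom set of aggregation functions" consists of three roles: one function that creates a combiner from the first value seen for a key, one that merges a further value into an existing combiner, and one that merges two combiners built on different partitions. The sketch below illustrates that flow in plain Java under stated assumptions: it does not use Spark, lists of pairs stand in for partitions, and the class CombineByKeySketch and the helpers pair, combinePartition, and mergePartitions are hypothetical names. The combiner here is an int[] holding a running sum and count.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;

final class CombineByKeySketch {
    /** Hypothetical helper that builds one key-value pair. */
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /** Combines the values of one partition into a single combiner per key. */
    static <K, V, C> Map<K, C> combinePartition(
            List<? extends Map.Entry<K, V>> partition,
            Function<V, C> createCombiner,
            BiFunction<C, V, C> mergeValue) {
        Map<K, C> combined = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : partition) {
            K key = e.getKey();
            if (combined.containsKey(key)) {
                combined.put(key, mergeValue.apply(combined.get(key), e.getValue()));
            } else {
                combined.put(key, createCombiner.apply(e.getValue()));
            }
        }
        return combined;
    }

    /** Merges per-partition combiners; assumes combiners are never null. */
    static <K, C> Map<K, C> mergePartitions(
            List<Map<K, C>> partials, BinaryOperator<C> mergeCombiners) {
        Map<K, C> merged = new LinkedHashMap<>();
        for (Map<K, C> partial : partials) {
            for (Map.Entry<K, C> e : partial.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), mergeCombiners);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Combiner layout: index 0 is the running sum, index 1 is the count.
        Function<Integer, int[]> create = v -> new int[] {v, 1};
        BiFunction<int[], Integer, int[]> mergeValue =
            (c, v) -> new int[] {c[0] + v, c[1] + 1};
        BinaryOperator<int[]> mergeCombiners =
            (x, y) -> new int[] {x[0] + y[0], x[1] + y[1]};

        List<Map.Entry<String, Integer>> p1 =
            Arrays.asList(pair("a", 1), pair("b", 4), pair("a", 3));
        List<Map.Entry<String, Integer>> p2 =
            Arrays.asList(pair("a", 5), pair("b", 6));

        Map<String, int[]> part1 = combinePartition(p1, create, mergeValue);
        Map<String, int[]> part2 = combinePartition(p2, create, mergeValue);
        Map<String, int[]> totals = mergePartitions(Arrays.asList(part1, part2), mergeCombiners);

        for (Map.Entry<String, int[]> e : totals.entrySet()) {
            double mean = (double) e.getValue()[0] / e.getValue()[1];
            System.out.println(e.getKey() + " mean = " + mean);
        }
    }
}
```

Keeping the combiner type separate from the value type is what lets a per-key mean be computed in one pass: values are Integer, while the combiner carries both the sum and the count.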
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the output RDD and uses map-side
aggregation.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing
partitioner/parallelism level and using map-side aggregation.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the
existing partitioner/parallelism level.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom function.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom function.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Experimental
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Experimental
Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
- combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Experimental
Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the
existing partitioner/parallelism level.
- combineCombinersByKey(Iterator<? extends Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<? extends Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- CommandLineUtils - Interface in org.apache.spark.util
-
Contains basic command line parsing functionality and methods to parse some common Spark CLI
options.
- commit(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- commit(Offset) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader
-
Informs the source that Spark has completed processing all data for offsets less than or
equal to `end` and will only request offsets greater than `end` in the future.
- commit(Offset) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader
-
Informs the source that Spark has completed processing all data for offsets less than or
equal to `end` and will only request offsets greater than `end` in the future.
- commit(WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.DataSourceWriter
-
Commits this writing job with a list of commit messages.
- commit() - Method in interface org.apache.spark.sql.sources.v2.writer.DataWriter
-
- commit(long, WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter
-
Commits this writing job for the specified epoch with a list of commit messages.
- commit(WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter
-
- commitJob(JobContext, Seq<FileCommitProtocol.TaskCommitMessage>) - Method in class org.apache.spark.internal.io.FileCommitProtocol
-
Commits a job after the writes succeed.
- commitJob(JobContext, Seq<FileCommitProtocol.TaskCommitMessage>) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
-
- commitTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol
-
Commits a task after the writes succeed.
- commitTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
-
- commitTask(OutputCommitter, TaskAttemptContext, int, int) - Static method in class org.apache.spark.mapred.SparkHadoopMapRedUtil
-
Commits a task output.
- commonHeaderNodes(HttpServletRequest) - Static method in class org.apache.spark.ui.UIUtils
-
- comparator(Schedulable, Schedulable) - Method in interface org.apache.spark.scheduler.SchedulingAlgorithm
-
- compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- compare(Decimal) - Method in class org.apache.spark.sql.types.Decimal
-
- compare(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
-
- compareTo(SparkShutdownHook) - Method in class org.apache.spark.util.SparkShutdownHook
-
- compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- compileValue(Object) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Converts value to SQL expression.
- compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- Complete() - Static method in class org.apache.spark.sql.streaming.OutputMode
-
OutputMode in which all the rows in the streaming DataFrame/Dataset will be written
to the sink every time there are some updates.
- completed() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- completedIndices() - Method in class org.apache.spark.status.LiveJob
-
- completedIndices() - Method in class org.apache.spark.status.LiveStage
-
- completedStages() - Method in class org.apache.spark.status.LiveJob
-
- completedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- completedTasks() - Method in class org.apache.spark.status.LiveExecutor
-
- completedTasks() - Method in class org.apache.spark.status.LiveJob
-
- completedTasks() - Method in class org.apache.spark.status.LiveStage
-
- COMPLETION_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- completionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
Time when all tasks in the stage completed or when the stage was cancelled.
- completionTime() - Method in class org.apache.spark.status.api.v1.JobData
-
- completionTime() - Method in class org.apache.spark.status.api.v1.StageData
-
- completionTime() - Method in class org.apache.spark.status.LiveJob
-
- ComplexFutureAction<T> - Class in org.apache.spark
-
A FutureAction for actions that could trigger multiple Spark jobs.
- ComplexFutureAction(Function1<JobSubmitter, Future<T>>) - Constructor for class org.apache.spark.ComplexFutureAction
-
- compressed() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Returns a matrix in dense column major, dense row major, sparse row major, or sparse column
major format, whichever uses less storage.
- compressed() - Method in interface org.apache.spark.ml.linalg.Vector
-
Returns a vector in either dense or sparse format, whichever uses less storage.
- compressed() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Returns a vector in either dense or sparse format, whichever uses less storage.
- compressedColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Returns a matrix in dense or sparse column major format, whichever uses less storage.
- compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.ZStdCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.ZStdCompressionCodec
-
- compressedRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Returns a matrix in dense or sparse row major format, whichever uses less storage.
- CompressionCodec - Interface in org.apache.spark.io
-
Developer API
CompressionCodec allows customizing which compression implementation
is used in block storage.
- compute(Partition, TaskContext) - Method in class org.apache.spark.api.r.BaseRRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
-
Provides the RDD[(VertexId, VD)] equivalent output.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point.
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point,
add the gradient to a provided vector to avoid creating new objects, and return loss.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
-
Compute an updated value for weights given the gradient, stepSize, iteration number and
regularization parameter.
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
Developer API
Implemented by subclasses to compute a given partition.
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
-
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Generate an RDD for the given duration
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Method that generates an RDD for the given Duration
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Method that generates an RDD for the given time
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- compute(long, long, long, long) - Method in interface org.apache.spark.streaming.scheduler.rate.RateEstimator
-
Computes the number of records the stream attached to this RateEstimator
should ingest per second, given an update on the size and completion
times of the latest batch.
- computeClusterStats(Dataset<Row>, String, String) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette
-
The method takes the input dataset and computes the aggregated values
about a cluster which are needed by the algorithm.
- computeClusterStats(Dataset<Row>, String, String) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
-
The method takes the input dataset and computes the aggregated values
about a cluster which are needed by the algorithm.
- computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes column-wise summary statistics.
- computeCorrelation(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Compute correlation for two datasets.
- computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation for two datasets.
- computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
Compute Spearman's correlation for two datasets.
- computeCorrelationMatrix(RDD<Vector>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Compute the correlation matrix S, for the input matrix, where S(i, j) is the correlation
between column i and j.
- computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the
correlation between column i and j.
- computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the
correlation between column i and j.
- computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation matrix from the covariance matrix.
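A minimal sketch of the arithmetic behind this entry, assuming the standard definition corr(i, j) = cov(i, j) / sqrt(cov(i, i) * cov(j, j)). The class and method names are hypothetical, plain arrays stand in for Spark's Matrix type, and all variances are assumed to be strictly positive; this is not Spark's implementation.

```java
import java.util.Arrays;

// Hypothetical helper; illustrates the formula only, not Spark's code.
final class CorrelationFromCovarianceSketch {

    /** Returns corr with corr[i][j] = cov[i][j] / sqrt(cov[i][i] * cov[j][j]). */
    static double[][] correlationFromCovariance(double[][] cov) {
        int n = cov.length;
        double[][] corr = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                // Assumption: variances on the diagonal are strictly positive,
                // so the diagonal of the correlation matrix is exactly 1.
                corr[i][j] = (i == j)
                        ? 1.0
                        : cov[i][j] / Math.sqrt(cov[i][i] * cov[j][j]);
            }
        }
        return corr;
    }

    public static void main(String[] args) {
        double[][] cov = {{4.0, 2.0}, {2.0, 9.0}};
        System.out.println(Arrays.deepToString(correlationFromCovariance(cov)));
    }
}
```

With variances 4 and 9 and covariance 2, the off-diagonal entry is 2 / (2 * 3) = 1/3.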
- computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the
correlation implementation for RDD[Vector].
- computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
- computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
- computeCost(Dataset<?>) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
Computes the sum of squared distances between the input points and their corresponding cluster
centers.
- computeCost(Dataset<?>) - Method in class org.apache.spark.ml.clustering.KMeansModel
-
- computeCost(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
Computes the squared distance between the input point and the cluster center it belongs to.
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
Computes the sum of squared distances between the input points and their corresponding cluster
centers.
- computeCost(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
Java-friendly version of computeCost().
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Return the K-means cost (sum of squared distances of points to their nearest center) for this
model on the given data.
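The cost described above can be sketched in plain Java as follows. This is an illustration of the definition only (sum over points of the squared distance to the nearest center); the class and method names are hypothetical, arrays stand in for RDD<Vector>, and nothing here reflects how Spark distributes the computation.

```java
// Hypothetical helper; illustrates the K-means cost definition only.
final class KMeansCostSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Sum of squared distances from each point to its nearest center. */
    static double cost(double[][] points, double[][] centers) {
        double total = 0.0;
        for (double[] p : points) {
            double best = Double.POSITIVE_INFINITY;
            for (double[] c : centers) {
                best = Math.min(best, squaredDistance(p, c));
            }
            total += best;
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {1, 0}, {10, 10}, {10, 11}};
        double[][] centers = {{0, 0}, {10, 10}};
        System.out.println(cost(points, centers));
    }
}
```

In the sample data, two points coincide with a center and two lie at distance 1 from one, so the cost is 0 + 1 + 0 + 1 = 2.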
- computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the covariance matrix, treating each row as an observation.
- computeError(RDD<LabeledPoint>, DecisionTreeRegressionModel[], double[], Loss) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees
-
Method to calculate error of the base learner for the gradient boosting calculation.
- computeError(org.apache.spark.mllib.tree.model.TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate error of the base learner for the gradient boosting calculation.
- computeError(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate loss when the predictions are already known.
- computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils
-
Returns a sampling rate that guarantees a sample of size greater than or equal to
sampleSizeLowerBound 99.99% of the time.
- computeGradient(DenseMatrix<Object>, DenseMatrix<Object>, Vector, int) - Method in interface org.apache.spark.ml.ann.TopologyModel
-
Computes gradient for the network
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the Gramian matrix A^T A.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the Gramian matrix A^T A.
- computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeRegressionModel, Loss) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees
-
Compute the initial predictions and errors for a dataset for the first
iteration of gradient boosting.
- computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
Developer API
Compute the initial predictions and errors for a dataset for the first
iteration of gradient boosting.
- computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
-
Computes the preferred locations based on the input(s) and returns a location-to-block map.
- computePrevDelta(DenseMatrix<Object>, DenseMatrix<Object>, DenseMatrix<Object>) - Method in interface org.apache.spark.ml.ann.LayerModel
-
Computes the delta for back propagation.
- computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the top k principal components only.
- computePrincipalComponentsAndExplainedVariance(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the top k principal components and a vector of proportions of
variance explained by each principal component.
- computeProbability(double) - Method in interface org.apache.spark.mllib.tree.loss.ClassificationLoss
-
Computes the class probability given the margin.
- computeSilhouetteCoefficient(Broadcast<Map<Object, Tuple2<Vector, Object>>>, Vector, double) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette
-
It computes the Silhouette coefficient for a point.
- computeSilhouetteCoefficient(Broadcast<Map<Object, SquaredEuclideanSilhouette.ClusterStats>>, Vector, double, double) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
-
It computes the Silhouette coefficient for a point.
- computeSilhouetteScore(Dataset<?>, String, String) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette
-
Compute the Silhouette score of the dataset using the cosine distance measure.
- computeSilhouetteScore(Dataset<?>, String, String) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
-
Compute the Silhouette score of the dataset using the squared Euclidean distance measure.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the singular value decomposition of this IndexedRowMatrix.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes singular value decomposition of this matrix.
- computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Given the result returned by getCounts, determine the threshold for accepting items to
generate an exact sample size.
- concat(Column...) - Static method in class org.apache.spark.sql.functions
-
Concatenates multiple input columns together into a single column.
- concat(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Concatenates multiple input columns together into a single column.
- concat_ws(String, Column...) - Static method in class org.apache.spark.sql.functions
-
Concatenates multiple input string columns together into a single string column,
using the given separator.
- concat_ws(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Concatenates multiple input string columns together into a single string column,
using the given separator.
- Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- conf() - Method in interface org.apache.spark.input.Configurable
-
- conf() - Method in class org.apache.spark.SparkEnv
-
- conf() - Method in class org.apache.spark.sql.hive.RelationConversions
-
- conf() - Method in class org.apache.spark.sql.SparkSession
-
Runtime configuration interface for Spark.
- confidence() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
-
Returns the confidence of the rule.
- confidence() - Method in class org.apache.spark.partial.BoundedDouble
-
- confidence() - Method in class org.apache.spark.util.sketch.CountMinSketch
-
- config(String, String) - Method in class org.apache.spark.sql.SparkSession.Builder
-
Sets a config option.
- config(String, long) - Method in class org.apache.spark.sql.SparkSession.Builder
-
Sets a config option.
- config(String, double) - Method in class org.apache.spark.sql.SparkSession.Builder
-
Sets a config option.
- config(String, boolean) - Method in class org.apache.spark.sql.SparkSession.Builder
-
Sets a config option.
- config(SparkConf) - Method in class org.apache.spark.sql.SparkSession.Builder
-
Sets a list of config options based on the given SparkConf.
- config - Class in org.apache.spark.status
-
- config() - Constructor for class org.apache.spark.status.config
-
- ConfigEntryWithDefault<T> - Class in org.apache.spark.internal.config
-
- ConfigEntryWithDefault(String, List<String>, T, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefault
-
- ConfigEntryWithDefaultFunction<T> - Class in org.apache.spark.internal.config
-
- ConfigEntryWithDefaultFunction(String, List<String>, Function0<T>, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
-
- ConfigEntryWithDefaultString<T> - Class in org.apache.spark.internal.config
-
- ConfigEntryWithDefaultString(String, List<String>, String, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefaultString
-
- ConfigHelpers - Class in org.apache.spark.internal.config
-
- ConfigHelpers() - Constructor for class org.apache.spark.internal.config.ConfigHelpers
-
- ConfigProvider - Interface in org.apache.spark.internal.config
-
A source of configuration values.
- configTestLog4j(String) - Static method in class org.apache.spark.TestUtils
-
Configures the log4j properties used for the test suite.
- Configurable - Interface in org.apache.spark.input
-
A trait to implement the Configurable interface.
- configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.NewHadoopRDD
-
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
- configureJobPropertiesForStorageHandler(TableDesc, Configuration, boolean) - Static method in class org.apache.spark.sql.hive.HiveTableUtil
-
- confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns the confusion matrix: predicted classes are in columns,
ordered by ascending class label, as in "labels".
- connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
- connectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
- ConnectedComponents - Class in org.apache.spark.graphx.lib
-
Connected components algorithm.
- ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
-
- consequent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
-
- ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
An input stream that always returns the same RDD on each time step.
- ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- constructTree(org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData[]) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
Given a list of nodes from a tree, construct the tree.
- constructTrees(RDD<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- constructURIForAuthentication(URI, org.apache.spark.SecurityManager) - Static method in class org.apache.spark.util.Utils
-
Construct a URI containing the information used for authentication.
- contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
-
Checks whether a parameter is explicitly specified.
- contains(String) - Method in class org.apache.spark.SparkConf
-
Does the configuration contain a given parameter?
- contains(Object) - Method in class org.apache.spark.sql.Column
-
Contains the other element.
- contains(String) - Method in class org.apache.spark.sql.types.Metadata
-
Tests whether this Metadata contains a binding for a key.
- containsDelimiters() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- containsKey(Object) - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
-
- containsNull() - Method in class org.apache.spark.sql.types.ArrayType
-
- contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
-
- context() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- context() - Method in class org.apache.spark.InterruptibleIterator
-
- context(SQLContext) - Static method in class org.apache.spark.ml.r.RWrappers
-
- context(SQLContext) - Method in interface org.apache.spark.ml.util.BaseReadWrite
-
- context(SQLContext) - Method in class org.apache.spark.ml.util.GeneralMLWriter
-
- context(SQLContext) - Method in class org.apache.spark.ml.util.MLReader
-
- context(SQLContext) - Method in class org.apache.spark.ml.util.MLWriter
-
- context() - Method in class org.apache.spark.rdd.RDD
-
- context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- context() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return the StreamingContext associated with this DStream
- ContextBarrierId - Class in org.apache.spark
-
For each barrier stage attempt, at most one barrier() call can be active at any time, so
(stageId, stageAttemptId) is enough to identify the stage attempt that the barrier() call
comes from.
- ContextBarrierId(int, int) - Constructor for class org.apache.spark.ContextBarrierId
-
- Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- Continuous(long) - Static method in class org.apache.spark.sql.streaming.Trigger
-
A trigger that continuously processes streaming data, asynchronously checkpointing at
the specified interval.
- Continuous(long, TimeUnit) - Static method in class org.apache.spark.sql.streaming.Trigger
-
A trigger that continuously processes streaming data, asynchronously checkpointing at
the specified interval.
- Continuous(Duration) - Static method in class org.apache.spark.sql.streaming.Trigger
-
(Scala-friendly)
A trigger that continuously processes streaming data, asynchronously checkpointing at
the specified interval.
- Continuous(String) - Static method in class org.apache.spark.sql.streaming.Trigger
-
A trigger that continuously processes streaming data, asynchronously checkpointing at
the specified interval.
- ContinuousInputPartition<T> - Interface in org.apache.spark.sql.sources.v2.reader
-
- ContinuousInputPartitionReader<T> - Interface in org.apache.spark.sql.sources.v2.reader.streaming
-
- ContinuousReader - Interface in org.apache.spark.sql.sources.v2.reader.streaming
-
- ContinuousReadSupport - Interface in org.apache.spark.sql.sources.v2
-
- ContinuousSplit - Class in org.apache.spark.ml.tree
-
Split which tests a continuous feature.
- conv(Column, int, int) - Static method in class org.apache.spark.sql.functions
-
Convert a number in a string column from one base to another.
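As a rough illustration of what base conversion of a digit string means, here is a plain-Java sketch on a single value rather than a Column. The class and method names are hypothetical, and the handling of negative numbers, invalid digits, overflow and letter case in Spark's conv may well differ from java.lang.Long, so treat this only as the general idea.

```java
// Hypothetical helper; shows base conversion on one string, not Spark's conv.
final class BaseConversionSketch {

    /**
     * Parses digits as a number in fromBase and renders it in toBase.
     * Long.parseLong throws NumberFormatException for invalid digits
     * or for a radix outside 2..36.
     */
    static String convertBase(String digits, int fromBase, int toBase) {
        long value = Long.parseLong(digits, fromBase);
        return Long.toString(value, toBase);
    }

    public static void main(String[] args) {
        System.out.println(convertBase("100", 2, 10));
        System.out.println(convertBase("ff", 16, 2));
    }
}
```

Parsing into a long first keeps the sketch short, at the price of limiting inputs to the signed 64-bit range.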
- CONVERT_METASTORE_ORC() - Static method in class org.apache.spark.sql.hive.HiveUtils
-
- CONVERT_METASTORE_PARQUET() - Static method in class org.apache.spark.sql.hive.HiveUtils
-
- CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING() - Static method in class org.apache.spark.sql.hive.HiveUtils
-
- convertMatrixColumnsFromML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Converts matrix columns in an input Dataset to the Matrix type from the new Matrix type
under the spark.ml package.
- convertMatrixColumnsFromML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Converts matrix columns in an input Dataset to the Matrix type from the new Matrix type
under the spark.ml package.
- convertMatrixColumnsToML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Converts Matrix columns in an input Dataset from the Matrix type to the new Matrix type
under the spark.ml package.
- convertMatrixColumnsToML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Converts Matrix columns in an input Dataset from the Matrix type to the new Matrix type
under the spark.ml package.
- convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps
-
Convert bi-directional edges into uni-directional ones.
- convertToOldLossType(String) - Method in interface org.apache.spark.ml.tree.GBTRegressorParams
-
- convertToTimeUnit(long, TimeUnit) - Static method in class org.apache.spark.streaming.ui.UIUtils
-
Convert milliseconds to the specified unit.
- convertVectorColumnsFromML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Converts vector columns in an input Dataset to the Vector type from the new Vector type
under the spark.ml package.
- convertVectorColumnsFromML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Converts vector columns in an input Dataset to the Vector type from the new Vector type
under the spark.ml package.
- convertVectorColumnsToML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Converts vector columns in an input Dataset from the Vector type to the new Vector type
under the spark.ml package.
- convertVectorColumnsToML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Converts vector columns in an input Dataset from the Vector type to the new Vector type
under the spark.ml package.
- CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
Represents a matrix in coordinate format.
- CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.LinearSVC
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.LinearSVCModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayes
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeans
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeansModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LDA
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LocalLDAModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- copy(ParamMap) - Method in class org.apache.spark.ml.Estimator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.Evaluator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Binarizer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.ColumnPruner
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.HashingTF
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDF
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDFModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Imputer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.ImputerModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.IndexToString
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Interaction
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinHashLSH
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinHashLSHModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCA
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCAModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormula
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormulaModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.SQLTransformer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Tokenizer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorSizeHint
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.fpm.FPGrowth
-
- copy(ParamMap) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- copy(Vector, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS
-
y = x
- copy() - Method in class org.apache.spark.ml.linalg.DenseMatrix
-
- copy() - Method in class org.apache.spark.ml.linalg.DenseVector
-
- copy() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Get a deep copy of the matrix.
- copy() - Method in class org.apache.spark.ml.linalg.SparseMatrix
-
- copy() - Method in class org.apache.spark.ml.linalg.SparseVector
-
- copy() - Method in interface org.apache.spark.ml.linalg.Vector
-
Makes a deep copy of this vector.
- copy(ParamMap) - Method in class org.apache.spark.ml.Model
-
- copy() - Method in class org.apache.spark.ml.param.ParamMap
-
Creates a copy of this param map.
- copy(ParamMap) - Method in interface org.apache.spark.ml.param.Params
-
Creates a copy of this instance with the same UID and some extra params.
- copy(ParamMap) - Method in class org.apache.spark.ml.Pipeline
-
- copy(ParamMap) - Method in class org.apache.spark.ml.PipelineModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.PipelineStage
-
- copy(ParamMap) - Method in class org.apache.spark.ml.Predictor
-
- copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
-
- copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegression
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- copy(ParamMap) - Method in class org.apache.spark.ml.Transformer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
-
- copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y = x
- copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Get a deep copy of the matrix.
- copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Makes a deep copy of this vector.
- copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
-
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the
class when applicable for non-locking concurrent usage.
- copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.WeibullGenerator
-
- copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Returns a shallow copy of this instance.
- copy() - Method in interface org.apache.spark.sql.Row
-
Make a copy of the current Row object.
- copy() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- copy() - Method in class org.apache.spark.sql.vectorized.ColumnarMap
-
- copy() - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
Revisit this.
- copy() - Method in class org.apache.spark.util.AccumulatorV2
-
Creates a new copy of this accumulator.
- copy() - Method in class org.apache.spark.util.CollectionAccumulator
-
- copy() - Method in class org.apache.spark.util.DoubleAccumulator
-
- copy() - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
-
- copy() - Method in class org.apache.spark.util.LongAccumulator
-
- copy() - Method in class org.apache.spark.util.StatCounter
-
Clone this StatCounter.
- copyAndReset() - Method in class org.apache.spark.util.AccumulatorV2
-
Creates a new copy of this accumulator that is reset to its zero value.
- copyAndReset() - Method in class org.apache.spark.util.CollectionAccumulator
-
- copyFileStreamNIO(FileChannel, FileChannel, long, long) - Static method in class org.apache.spark.util.Utils
-
- copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils
-
Copy all data from an InputStream to an OutputStream.
- copyValues(T, ParamMap) - Method in interface org.apache.spark.ml.param.Params
-
Copies param values from this instance to another instance for params shared by them.
- cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- coresGranted() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- coresPerExecutor() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- corr(Dataset<?>, String, String) - Static method in class org.apache.spark.ml.stat.Correlation
-
Experimental
Compute the correlation matrix for the input Dataset of Vectors using the specified method.
- corr(Dataset<?>, String) - Static method in class org.apache.spark.ml.stat.Correlation
-
Compute the Pearson correlation matrix for the input Dataset of Vectors.
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the Pearson correlation matrix for the input RDD of Vectors.
- corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the correlation matrix for the input RDD of Vectors using the specified method.
- corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the Pearson correlation for the input RDDs.
- corr(JavaRDD<Double>, JavaRDD<Double>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Java-friendly version of corr().
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the correlation for the input RDDs using the specified method.
- corr(JavaRDD<Double>, JavaRDD<Double>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Java-friendly version of corr().
- corr(String, String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Calculates the correlation of two columns of a DataFrame.
- corr(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Calculates the Pearson Correlation Coefficient of two columns of a DataFrame.
- corr(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the Pearson Correlation Coefficient for two columns.
- corr(String, String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the Pearson Correlation Coefficient for two columns.
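For readers unfamiliar with the statistic behind the corr entries above, the following is a minimal plain-Java sketch of the Pearson correlation coefficient for two equally sized arrays. The class and method names (`PearsonSketch`, `pearson`) are hypothetical and are not part of the Spark API; the sketch only illustrates the formula and says nothing about how Spark computes it in a distributed setting.

```java
final class PearsonSketch {
    // Pearson correlation: covariance of x and y divided by the product
    // of their standard deviations. Returns NaN if either input is constant.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of products of deviations
        double sxx = 0.0; // sum of squared deviations of x
        double syy = 0.0; // sum of squared deviations of y
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 8};
        System.out.println(pearson(x, y)); // perfectly linear data
    }
}
```

The value lies in [-1, 1]; the sign gives the direction of the linear relationship.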
- Correlation - Class in org.apache.spark.ml.stat
-
API for correlation functions in MLlib, compatible with DataFrames and Datasets.
- Correlation() - Constructor for class org.apache.spark.ml.stat.Correlation
-
- Correlation - Interface in org.apache.spark.mllib.stat.correlation
-
Trait for correlation algorithms.
- CorrelationNames - Class in org.apache.spark.mllib.stat.correlation
-
Maintains supported and default correlation names.
- CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
-
- Correlations - Class in org.apache.spark.mllib.stat.correlation
-
Delegates computation to the specific correlation object based on the input method name.
- Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
-
- corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- cos(Column) - Static method in class org.apache.spark.sql.functions
-
- cos(String) - Static method in class org.apache.spark.sql.functions
-
- cosh(Column) - Static method in class org.apache.spark.sql.functions
-
- cosh(String) - Static method in class org.apache.spark.sql.functions
-
- CosineSilhouette - Class in org.apache.spark.ml.evaluation
-
An efficient, parallel implementation of the Silhouette using the cosine distance measure.
- CosineSilhouette() - Constructor for class org.apache.spark.ml.evaluation.CosineSilhouette
-
- count() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
The number of edges in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
The number of vertices in the RDD.
- count() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
-
- count() - Method in class org.apache.spark.ml.regression.AFTAggregator
-
- count(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
-
- count(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
-
- count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Sample size.
- count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample size.
- count() - Method in class org.apache.spark.rdd.RDD
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.sql.Dataset
-
Returns the number of rows in the Dataset.
- count(MapFunction<T, Object>) - Static method in class org.apache.spark.sql.expressions.javalang.typed
-
Count aggregate function.
- count(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed
-
Count aggregate function.
- count(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of items in a group.
- count(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of items in a group.
- count() - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Returns a Dataset that contains a tuple with each key and the number of items present
for that key.
- count() - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Count the number of rows for each group.
- count() - Method in class org.apache.spark.status.RDDPartitionSeq
-
- count() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
-
- count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.util.DoubleAccumulator
-
Returns the number of elements added to the accumulator.
- count() - Method in class org.apache.spark.util.LongAccumulator
-
Returns the number of elements added to the accumulator.
- count() - Method in class org.apache.spark.util.StatCounter
-
- countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
-
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of count, which returns a
future for counting the number of elements in this RDD.
- countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for counting the number of elements in the RDD.
- countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Count the number of elements for each key, and return the result to the master as a Map.
- countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Count the number of elements for each key, collecting the results to a local Map.
- countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the count of each unique value in this RDD as a map of (value, count) pairs.
- countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
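The shape of the result described by the countByValue entries above, a map of (value, count) pairs, can be illustrated on an ordinary in-memory list. This is a plain-Java sketch under that assumption; `CountByValueSketch` is a hypothetical name and the code does not touch any Spark class.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CountByValueSketch {
    // Builds a local map from each distinct value to the number of times it occurs.
    static <T> Map<T, Long> countByValue(List<T> values) {
        Map<T, Long> counts = new HashMap<>();
        for (T value : values) {
            counts.merge(value, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        System.out.println(countByValue(words));
    }
}
```

Because the whole map is held locally, this shape only suits inputs with a modest number of distinct values.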
- countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Approximate version of countByValue().
- countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Approximate version of countByValue().
- countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Approximate version of countByValue().
- countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a window over this DStream.
- countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a sliding window over this DStream.
- countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- COUNTER() - Static method in class org.apache.spark.metrics.sink.StatsdMetricType
-
- CountingWritableChannel - Class in org.apache.spark.storage
-
- CountingWritableChannel(WritableByteChannel) - Constructor for class org.apache.spark.storage.CountingWritableChannel
-
- countMinSketch(String, int, int, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Builds a Count-min Sketch over a specified column.
- countMinSketch(String, double, double, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Builds a Count-min Sketch over a specified column.
- countMinSketch(Column, int, int, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Builds a Count-min Sketch over a specified column.
- countMinSketch(Column, double, double, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Builds a Count-min Sketch over a specified column.
- CountMinSketch - Class in org.apache.spark.util.sketch
-
A Count-min sketch is a probabilistic data structure used for frequency estimation using
sub-linear space.
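To make the idea concrete, here is a minimal plain-Java sketch of a Count-min sketch: a depth-by-width table of counters, one hash function per row, and a query that takes the minimum across rows. The class name `CountMinSketchDemo`, its constructor parameters, and the hash mixing step are all assumptions made for illustration; this is not the org.apache.spark.util.sketch implementation.

```java
import java.util.Random;

final class CountMinSketchDemo {
    private final int depth;       // number of rows (independent hash functions)
    private final int width;       // number of counters per row
    private final long[][] table;  // counters
    private final int[] seeds;     // one seed per row

    CountMinSketchDemo(int depth, int width, long seed) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.depth = depth;
        this.width = width;
        this.table = new long[depth][width];
        this.seeds = new int[depth];
        Random rng = new Random(seed);
        for (int row = 0; row < depth; row++) {
            seeds[row] = rng.nextInt();
        }
    }

    // Maps an item to a counter index in the given row (simple illustrative mixing).
    private int bucket(Object item, int row) {
        int h = item.hashCode() ^ seeds[row];
        h *= 0x9E3779B9;
        h ^= (h >>> 16);
        return Math.floorMod(h, width);
    }

    // Adds a non-negative count for the item to one counter in every row.
    void add(Object item, long count) {
        for (int row = 0; row < depth; row++) {
            table[row][bucket(item, row)] += count;
        }
    }

    // Returns the minimum counter across rows; never below the true count.
    long estimate(Object item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, table[row][bucket(item, row)]);
        }
        return min;
    }

    public static void main(String[] args) {
        CountMinSketchDemo sketch = new CountMinSketchDemo(4, 1024, 42L);
        sketch.add("spark", 3);
        sketch.add("flink", 1);
        System.out.println(sketch.estimate("spark"));
    }
}
```

Collisions can only inflate a counter, so an estimate may exceed the true count but never falls below it.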
- CountMinSketch() - Constructor for class org.apache.spark.util.sketch.CountMinSketch
-
- CountMinSketch.Version - Enum in org.apache.spark.util.sketch
-
- countTowardsTaskFailures() - Method in class org.apache.spark.ExecutorLostFailure
-
- countTowardsTaskFailures() - Method in class org.apache.spark.FetchFailed
-
Fetch failures lead to a different failure handling path: (1) we don't abort the stage after
4 task failures, instead we immediately go back to the stage which generated the map output,
and regenerate the missing data.
- countTowardsTaskFailures() - Static method in class org.apache.spark.Resubmitted
-
- countTowardsTaskFailures() - Method in class org.apache.spark.TaskCommitDenied
-
If a task failed because its attempt to commit was denied, do not count this failure
towards failing the stage.
- countTowardsTaskFailures() - Method in interface org.apache.spark.TaskFailedReason
-
Whether this task failure should be counted towards the maximum number of times the task is
allowed to fail before the stage is aborted.
- countTowardsTaskFailures() - Method in class org.apache.spark.TaskKilled
-
- countTowardsTaskFailures() - Static method in class org.apache.spark.TaskResultLost
-
- countTowardsTaskFailures() - Static method in class org.apache.spark.UnknownReason
-
- CountVectorizer - Class in org.apache.spark.ml.feature
-
- CountVectorizer(String) - Constructor for class org.apache.spark.ml.feature.CountVectorizer
-
- CountVectorizer() - Constructor for class org.apache.spark.ml.feature.CountVectorizer
-
- CountVectorizerModel - Class in org.apache.spark.ml.feature
-
Converts a text document to a sparse vector of token counts.
- CountVectorizerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
-
- CountVectorizerModel(String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
-
- CountVectorizerParams - Interface in org.apache.spark.ml.feature
-
- cov() - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian
-
- cov(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Calculate the sample covariance of two numerical columns of a DataFrame.
- covar_pop(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the population covariance for two columns.
- covar_pop(String, String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the population covariance for two columns.
- covar_samp(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sample covariance for two columns.
- covar_samp(String, String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sample covariance for two columns.
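The covar_pop and covar_samp entries above differ only in the divisor: the population covariance divides the sum of products of deviations by n, the sample covariance by n - 1. A minimal plain-Java sketch of that difference follows; `CovarianceSketch` and its `sample` flag are hypothetical names used only for illustration.

```java
final class CovarianceSketch {
    // Covariance of x and y; divides by (n - 1) when sample is true, else by n.
    static double covariance(double[] x, double[] y, boolean sample) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sum = 0.0; // sum of products of deviations from the means
        for (int i = 0; i < n; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        return sum / (sample ? n - 1 : n);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {2, 4, 6};
        System.out.println("population: " + covariance(x, y, false));
        System.out.println("sample:     " + covariance(x, y, true));
    }
}
```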
- covs() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
-
- crc32(Column) - Static method in class org.apache.spark.sql.functions
-
Calculates the cyclic redundancy check value (CRC32) of a binary column and
returns the value as a bigint.
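As a point of reference for the crc32 entry above, the JDK ships a CRC32 implementation in java.util.zip. The sketch below computes the checksum of a byte array and returns it as a long, which is the Java counterpart of a bigint. It assumes the same standard CRC-32 polynomial and does not call Spark; `Crc32Sketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class Crc32Sketch {
    // Computes the CRC-32 checksum of the given bytes as an unsigned 32-bit value in a long.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        // 0xCBF43926 is the standard CRC-32 check value for the ASCII string "123456789".
        System.out.println(Long.toHexString(crc32(check)));
    }
}
```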
- CreatableRelationProvider - Interface in org.apache.spark.sql.sources
-
- create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Create a new StorageLevel object.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes a SQL query on a JDBC connection and reads results.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes a SQL query on a JDBC connection and reads results.
- create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
-
Create a PartitionPruningRDD.
- create(RpcEnvConfig) - Method in interface org.apache.spark.rpc.RpcEnvFactory
-
- create(Object, DataType, Seq<Option<ScalaReflection.Schema>>) - Static method in class org.apache.spark.sql.expressions.SparkUserDefinedFunction
-
- create(Object...) - Static method in class org.apache.spark.sql.RowFactory
-
Create a Row from the given arguments.
- create(String) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
-
- create(long, TimeUnit) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
-
- create(long) - Static method in class org.apache.spark.util.sketch.BloomFilter
-
Creates a BloomFilter with the expected number of insertions and a default expected
false positive probability of 3%.
- create(long, double) - Static method in class org.apache.spark.util.sketch.BloomFilter
-
Creates a BloomFilter with the expected number of insertions and expected false
positive probability.
- create(long, long) - Static method in class org.apache.spark.util.sketch.BloomFilter
-
Creates a BloomFilter with the given expectedNumItems and numBits; it picks an optimal
numHashFunctions that minimizes fpp for the bloom filter.
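The textbook choice of the number of hash functions for a Bloom filter with m bits and n expected items is k = (m / n) ln 2, which minimizes the approximate false positive probability (1 - e^(-kn/m))^k. The plain-Java sketch below applies that formula; it is an illustration of the standard result, not a copy of the Spark implementation, and the names `BloomSizingSketch`, `optimalNumHashFunctions`, and `falsePositiveRate` are hypothetical.

```java
final class BloomSizingSketch {
    // Standard optimum k = (m / n) * ln 2, rounded and clamped to at least 1.
    static int optimalNumHashFunctions(long expectedNumItems, long numBits) {
        double k = (double) numBits / expectedNumItems * Math.log(2);
        return Math.max(1, (int) Math.round(k));
    }

    // Approximate false positive probability for n items, m bits, and k hash functions.
    static double falsePositiveRate(long expectedNumItems, long numBits, int numHashFunctions) {
        double fractionSet = 1.0 - Math.exp(-(double) numHashFunctions * expectedNumItems / numBits);
        return Math.pow(fractionSet, numHashFunctions);
    }

    public static void main(String[] args) {
        long n = 1_000L;
        long m = 10_000L;
        int k = optimalNumHashFunctions(n, m);
        System.out.println("k = " + k + ", fpp ~ " + falsePositiveRate(n, m, k));
    }
}
```

With 10 bits per item the formula gives k = 7 and a false positive probability a little under 1%.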
- create(int, int, int) - Static method in class org.apache.spark.util.sketch.CountMinSketch
-
- create(double, double, int) - Static method in class org.apache.spark.util.sketch.CountMinSketch
-
Creates a CountMinSketch with given relative error (eps), confidence, and random seed.
- createArrayType(DataType) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates an ArrayType by specifying the data type of elements (elementType).
- createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates an ArrayType by specifying the data type of elements (elementType) and
whether the array contains null values (containsNull).
- createAttrGroupForAttrNames(String, int, boolean, boolean) - Static method in class org.apache.spark.ml.feature.OneHotEncoderCommon
-
Creates an `AttributeGroup` with the required number of `BinaryAttribute`.
- createCombiner() - Method in class org.apache.spark.Aggregator
-
- createCommitter(int) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
-
- createCompiledClass(String, File, TestUtils.JavaSourceFromString, Seq<URL>) - Static method in class org.apache.spark.TestUtils
-
Creates a compiled class with the source file.
- createCompiledClass(String, File, String, String, Seq<URL>) - Static method in class org.apache.spark.TestUtils
-
Creates a compiled class with the given name.
- createContinuousReader(Optional<StructType>, String, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.ContinuousReadSupport
-
- createContinuousReader(PartitionOffset) - Method in interface org.apache.spark.sql.sources.v2.reader.ContinuousInputPartition
-
Create an input partition reader with particular offset as its startOffset.
- createCryptoInputStream(InputStream, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils
-
Helper method to wrap InputStream with CryptoInputStream for decryption.
- createCryptoOutputStream(OutputStream, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils
-
Helper method to wrap OutputStream with CryptoOutputStream for encryption.
- createDatabase(CatalogDatabase, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Creates a new database with the given name.
- createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a DataFrame from an RDD of Product (e.g.
- createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a DataFrame from a local Seq of Product.
- createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession
-
Developer API
Creates a DataFrame from an RDD containing Rows using the given schema.
- createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession
-
Developer API
Creates a DataFrame from a JavaRDD containing Rows using the given schema.
- createDataFrame(List<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession
-
Developer API
Creates a DataFrame from a java.util.List containing Rows using the given schema.
- createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession
-
Applies a schema to an RDD of Java Beans.
- createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession
-
Applies a schema to an RDD of Java Beans.
- createDataFrame(List<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession
-
Applies a schema to a List of Java Beans.
- createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(List<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(List<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataset(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a Dataset from a local Seq of data of a given type.
- createDataset(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a Dataset from an RDD of a given type.
- createDataset(List<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a Dataset from a java.util.List of a given type.
- createDataset(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataset(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataset(List<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataWriter(int, long, long) - Method in interface org.apache.spark.sql.sources.v2.writer.DataWriterFactory
-
Returns a data writer to do the actual writing work.
- createDecimalType(int, int) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a DecimalType by specifying the precision and scale.
- createDecimalType() - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a DecimalType with default precision and scale, which are 10 and 0.
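To make precision and scale concrete: precision is the total number of digits a value may carry, and scale is how many of those digits sit to the right of the decimal point. The plain-Java sketch below uses java.math.BigDecimal to check whether a value fits a given (precision, scale) pair. The class DecimalFitSketch and its fits method are hypothetical names used for illustration only; they are not part of Spark and do not reproduce Spark's own overflow handling.

```java
import java.math.BigDecimal;

final class DecimalFitSketch {
    /**
     * Returns true if value can be stored exactly with at most `precision`
     * total digits, of which `scale` are fractional digits.
     */
    static boolean fits(BigDecimal value, int precision, int scale) {
        if (value.signum() == 0) {
            return true; // zero fits any precision and scale
        }
        BigDecimal stripped = value.stripTrailingZeros();
        if (stripped.scale() > scale) {
            return false; // needs more fractional digits than the scale allows
        }
        // Digits to the left of the decimal point.
        int integerDigits = stripped.precision() - stripped.scale();
        return integerDigits <= precision - scale;
    }

    public static void main(String[] args) {
        // 123.45 has five digits, two of them fractional.
        System.out.println(fits(new BigDecimal("123.45"), 5, 2));  // true
        // The default (10, 0) leaves no room for fractional digits.
        System.out.println(fits(new BigDecimal("123.45"), 10, 0)); // false
    }
}
```

Under these assumptions, the default precision of 10 and scale of 0 holds whole numbers of up to ten digits.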
- createDF(RDD<byte[]>, StructType, SparkSession) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- createDirectory(String, String) - Static method in class org.apache.spark.util.Utils
-
Create a directory inside the given parent directory.
- createdTempDir() - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
-
- createExternalTable(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
- createExternalTable(String, String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
-
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
-
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
-
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
-
- createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- createFilter(StructType, Filter[]) - Static method in class org.apache.spark.sql.hive.orc.OrcFilters
-
- createFunction(String, CatalogFunction) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Create a function in an existing database.
- createGlobalTempView(String) - Method in class org.apache.spark.sql.Dataset
-
Creates a global temporary view using the given name.
- CreateHiveTableAsSelectCommand - Class in org.apache.spark.sql.hive.execution
-
Create table and insert the query result into it.
- CreateHiveTableAsSelectCommand(CatalogTable, LogicalPlan, Seq<String>, SaveMode) - Constructor for class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
-
- createJar(Seq<File>, File, Option<String>) - Static method in class org.apache.spark.TestUtils
-
Create a jar file that contains this set of files.
- createJarWithClasses(Seq<String>, String, Seq<Tuple2<String, String>>, Seq<URL>) - Static method in class org.apache.spark.TestUtils
-
Create a jar that defines classes with the given names.
- createJarWithFiles(Map<String, String>, File) - Static method in class org.apache.spark.TestUtils
-
Create a jar file containing multiple files.
- createJobContext(String, int) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
-
- createJobID(Date, int) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
-
- createJobTrackerID(Date) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
-
- createKey(SparkConf) - Static method in class org.apache.spark.security.CryptoStreamUtils
-
Creates a new encryption key.
- createListeners(SparkConf, ElementTrackingStore) - Method in interface org.apache.spark.status.AppHistoryServerPlugin
-
Creates listeners to replay the event logs.
- createLogForDriver(SparkConf, String, Configuration) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
Create a WriteAheadLog for the driver.
- createLogForReceiver(SparkConf, String, Configuration) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
Create a WriteAheadLog for the receiver.
- createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
- createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
- createMetrics(long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long) - Static method in class org.apache.spark.status.LiveEntityHelpers
-
- createMetrics(long) - Static method in class org.apache.spark.status.LiveEntityHelpers
-
- createMicroBatchReader(Optional<StructType>, String, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.MicroBatchReadSupport
-
Creates a MicroBatchReader to read batches of data from this data source in a streaming query.
- createModel(DenseVector<Object>) - Method in interface org.apache.spark.ml.ann.Layer
-
Returns the instance of the layer based on weights provided.
- createOrReplaceGlobalTempView(String) - Method in class org.apache.spark.sql.Dataset
-
Creates or replaces a global temporary view using the given name.
- createOrReplaceTempView(String) - Method in class org.apache.spark.sql.Dataset
-
Creates or replaces a local temporary view using the given name.
- createOutputOperationFailureForUI(String) - Static method in class org.apache.spark.streaming.ui.UIUtils
-
- createPartitionReader() - Method in interface org.apache.spark.sql.sources.v2.reader.InputPartition
-
Returns an input partition reader to do the actual reading work.
- createPartitions(String, String, Seq<CatalogTablePartition>, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Create one or many partitions in the given table.
- createPathFromString(String, JobConf) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
-
- createPMMLModelExport(Object) - Static method in class org.apache.spark.mllib.pmml.export.PMMLModelExportFactory
-
Factory object that helps create the necessary PMMLModelExport implementation, taking the machine learning model (for example, KMeansModel) as input.
- createProxyHandler(Function1<String, Option<String>>) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler for proxying requests to Workers and Application Drivers.
- createProxyLocationHeader(String, HttpServletRequest, URI) - Static method in class org.apache.spark.ui.JettyUtils
-
- createProxyURI(String, String, String, String) - Static method in class org.apache.spark.ui.JettyUtils
-
- createRDDFromArray(JavaSparkContext, byte[][]) - Static method in class org.apache.spark.api.r.RRDD
-
Create an RRDD given a sequence of byte arrays.
- createRDDFromFile(JavaSparkContext, String, int) - Static method in class org.apache.spark.api.r.RRDD
-
Create an RRDD given a temporary file name.
- createReadableChannel(ReadableByteChannel, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils
-
Wrap a ReadableByteChannel for decryption.
- createReader(StructType, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.ReadSupport
-
- createReader(DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.ReadSupport
-
- createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String, Set<String>) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler that always redirects the user to the given path
- createRelation(SQLContext, SaveMode, Map<String, String>, Dataset<Row>) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider
-
Saves a DataFrame to a destination (using data source-specific parameters)
- createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider
-
Returns a new base relation with the given parameters and user defined schema.
- createSchedulerBackend(SparkContext, String, TaskScheduler) - Method in interface org.apache.spark.scheduler.ExternalClusterManager
-
Create a scheduler backend for the given SparkContext and scheduler.
- createSecret(SparkConf) - Static method in class org.apache.spark.util.Utils
-
- createServlet(JettyUtils.ServletParams<T>, org.apache.spark.SecurityManager, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
-
- createServletHandler(String, JettyUtils.ServletParams<T>, org.apache.spark.SecurityManager, SparkConf, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a context handler that responds to a request with the given path prefix
- createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a context handler that responds to a request with the given path prefix
- createSink(SQLContext, Map<String, String>, Seq<String>, OutputMode) - Method in interface org.apache.spark.sql.sources.StreamSinkProvider
-
- createSource(SQLContext, String, Option<StructType>, String, Map<String, String>) - Method in interface org.apache.spark.sql.sources.StreamSourceProvider
-
- createSparkContext(String, String, String, String[], Map<Object, Object>, Map<Object, Object>) - Static method in class org.apache.spark.api.r.RRDD
-
- createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler for serving files from a static directory
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, String, String, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, String, String, String, String, String, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>, String, String, String, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
- createStream(JavaStreamingContext, String, String, String, String, int, Duration, StorageLevel, String, String, String, String, String) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
-
- createStreamWriter(String, StructType, OutputMode, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.StreamWriteSupport
-
Creates an optional StreamWriter to save the data to this data source.
- createStructField(String, String, boolean) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
- createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a StructField with empty metadata.
- createStructType(Seq<StructField>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- createStructType(List<StructField>) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a StructType with the given list of StructFields (fields).
- createStructType(StructField[]) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a StructType with the given StructField array (fields).
- createTable(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Experimental
Creates a table from the given path and returns the corresponding DataFrame.
- createTable(String, String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Experimental
Creates a table from the given path based on a data source and returns the corresponding
DataFrame.
- createTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
-
Experimental
Creates a table based on the dataset in a data source and a set of options.
- createTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
-
Experimental
(Scala-specific)
Creates a table based on the dataset in a data source and a set of options.
- createTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
-
Experimental
Create a table based on the dataset in a data source, a schema and a set of options.
- createTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
-
Experimental
(Scala-specific)
Create a table based on the dataset in a data source, a schema and a set of options.
- createTable(CatalogTable, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Creates a table with the given metadata.
- createTaskAttemptContext(String, int, int, int) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
-
- createTaskScheduler(SparkContext, String) - Method in interface org.apache.spark.scheduler.ExternalClusterManager
-
Create a task scheduler instance for the given SparkContext
- createTempDir(String, String) - Static method in class org.apache.spark.util.Utils
-
Create a temporary directory inside the given parent directory.
- createTempView(String) - Method in class org.apache.spark.sql.Dataset
-
Creates a local temporary view using the given name.
- createUnsafe(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-
Creates a decimal from unscaled, precision and scale without checking the bounds.
- createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
-
- createWritableChannel(WritableByteChannel, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils
-
Wrap a WritableByteChannel for encryption.
- createWriter(String, StructType, SaveMode, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.WriteSupport
-
- createWriterFactory() - Method in interface org.apache.spark.sql.sources.v2.writer.DataSourceWriter
-
Creates a writer factory which will be serialized and sent to executors.
- crossJoin(Dataset<?>) - Method in class org.apache.spark.sql.Dataset
-
Explicit cartesian join with another DataFrame.
- crosstab(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Computes a pair-wise frequency table of the given columns.
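A pair-wise frequency table counts how often each value of the first column occurs together with each value of the second. The plain-Java sketch below shows that counting step on in-memory rows. CrosstabSketch and its crosstab method are hypothetical names for illustration; this is not Spark's implementation and does not reproduce the layout of the DataFrame that Spark returns.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CrosstabSketch {
    /**
     * Counts co-occurrences of (row[0], row[1]) pairs.
     * Outer key: value of the first column; inner key: value of the second.
     */
    static Map<String, Map<String, Long>> crosstab(List<String[]> rows) {
        Map<String, Map<String, Long>> table = new TreeMap<>();
        for (String[] row : rows) {
            table.computeIfAbsent(row[0], key -> new TreeMap<>())
                 .merge(row[1], 1L, Long::sum);
        }
        return table;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"a", "x"},
            new String[] {"a", "y"},
            new String[] {"a", "x"},
            new String[] {"b", "y"});
        System.out.println(crosstab(rows)); // {a={x=2, y=1}, b={y=1}}
    }
}
```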
- CrossValidator - Class in org.apache.spark.ml.tuning
-
K-fold cross validation performs model selection by splitting the dataset into a set of
non-overlapping, randomly partitioned folds, which are used as separate training and test datasets.
For example, with k=3 folds, K-fold cross validation will generate 3 (training, test) dataset pairs,
each of which uses 2/3 of the data for training and 1/3 for testing.
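The fold construction described above can be sketched in plain Java on row indices. KFoldSketch, Split, and kFold are hypothetical names, and the shuffle-then-deal strategy is an assumption chosen for clarity; it is not a description of how CrossValidator splits data internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class KFoldSketch {
    /** One (training, test) pair of row indices. */
    static final class Split {
        final List<Integer> training;
        final List<Integer> test;

        Split(List<Integer> training, List<Integer> test) {
            this.training = training;
            this.test = test;
        }
    }

    /** Shuffles row indices 0..numRows-1 and deals them into k non-overlapping folds. */
    static List<Split> kFold(int numRows, int k, long seed) {
        if (k < 2 || k > numRows) {
            throw new IllegalArgumentException("k must be between 2 and numRows");
        }
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < numRows; i++) {
            rows.add(i);
        }
        Collections.shuffle(rows, new Random(seed));

        List<Split> splits = new ArrayList<>();
        for (int fold = 0; fold < k; fold++) {
            List<Integer> training = new ArrayList<>();
            List<Integer> test = new ArrayList<>();
            for (int pos = 0; pos < numRows; pos++) {
                // Every k-th shuffled row belongs to this fold's test set.
                if (pos % k == fold) {
                    test.add(rows.get(pos));
                } else {
                    training.add(rows.get(pos));
                }
            }
            splits.add(new Split(training, test));
        }
        return splits;
    }

    public static void main(String[] args) {
        for (Split split : kFold(9, 3, 42L)) {
            // With 9 rows and k=3, each pair has 6 training rows and 3 test rows.
            System.out.println(split.training.size() + " training, " + split.test.size() + " test");
        }
    }
}
```

Each row lands in exactly one test set, so every row is used for testing once and for training k-1 times.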
- CrossValidator(String) - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-
- CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-
- CrossValidatorModel - Class in org.apache.spark.ml.tuning
-
CrossValidatorModel contains the model with the highest average cross-validation
metric across folds and uses this model to transform input data.
- CrossValidatorModel.CrossValidatorModelWriter - Class in org.apache.spark.ml.tuning
-
Writer for CrossValidatorModel.
- CrossValidatorParams - Interface in org.apache.spark.ml.tuning
-
- CryptoStreamUtils - Class in org.apache.spark.security
-
A util class for manipulating IO encryption and decryption streams.
- CryptoStreamUtils() - Constructor for class org.apache.spark.security.CryptoStreamUtils
-
- CryptoStreamUtils.BaseErrorHandler - Interface in org.apache.spark.security
-
SPARK-25535.
- CryptoStreamUtils.ErrorHandlingReadableChannel - Class in org.apache.spark.security
-
- csv(String...) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads CSV files and returns the result as a DataFrame.
- csv(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads a CSV file and returns the result as a DataFrame.
- csv(Dataset<String>) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads a Dataset[String] storing CSV rows and returns the result as a DataFrame.
- csv(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads CSV files and returns the result as a DataFrame.
- csv(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame in CSV format at the specified path.
- csv(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Loads a CSV file stream and returns the result as a DataFrame.
- cube(Column...) - Method in class org.apache.spark.sql.Dataset
-
Create a multi-dimensional cube for the current Dataset using the specified columns,
so we can run aggregation on them.
- cube(String, String...) - Method in class org.apache.spark.sql.Dataset
-
Create a multi-dimensional cube for the current Dataset using the specified columns,
so we can run aggregation on them.
- cube(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Create a multi-dimensional cube for the current Dataset using the specified columns,
so we can run aggregation on them.
- cube(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Create a multi-dimensional cube for the current Dataset using the specified columns,
so we can run aggregation on them.
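A cube aggregates over every combination of the specified columns, from all of them together down to the grand total with no grouping column. The plain-Java sketch below only enumerates those combinations; CubeSketch and groupingSets are hypothetical names for illustration and are not part of the Dataset API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CubeSketch {
    /** Returns all 2^n subsets of the given columns, largest combination first. */
    static List<List<String>> groupingSets(List<String> columns) {
        int n = columns.size();
        List<List<String>> sets = new ArrayList<>();
        for (int mask = (1 << n) - 1; mask >= 0; mask--) {
            List<String> set = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                // Bit i of the mask decides whether column i is grouped on.
                if ((mask & (1 << i)) != 0) {
                    set.add(columns.get(i));
                }
            }
            sets.add(set);
        }
        return sets;
    }

    public static void main(String[] args) {
        System.out.println(groupingSets(Arrays.asList("department", "gender")));
        // [[department, gender], [gender], [department], []]
    }
}
```

The empty list corresponds to the overall aggregate across all rows.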
- CubeType$() - Constructor for class org.apache.spark.sql.RelationalGroupedDataset.CubeType$
-
- cume_dist() - Static method in class org.apache.spark.sql.functions
-
Window function: returns the cumulative distribution of values within a window partition.
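As a rough illustration of a cumulative distribution, the plain-Java sketch below assumes the common SQL definition: for each row, the number of rows in the partition whose value is less than or equal to that row's value, divided by the partition size. CumeDistSketch and cumeDist are hypothetical names, and the sketch ignores window ordering options and nulls.

```java
final class CumeDistSketch {
    /** For each value, returns the fraction of values in the partition that are <= it. */
    static double[] cumeDist(double[] partition) {
        int n = partition.length;
        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            int atOrBelow = 0;
            for (double value : partition) {
                if (value <= partition[i]) {
                    atOrBelow++;
                }
            }
            result[i] = (double) atOrBelow / n;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] result = cumeDist(new double[] {10, 20, 20, 40});
        // Tied values share the same result: 0.25, 0.75, 0.75, 1.0
        for (double r : result) {
            System.out.println(r);
        }
    }
}
```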
- current_date() - Static method in class org.apache.spark.sql.functions
-
Returns the current date as a date column.
- current_timestamp() - Static method in class org.apache.spark.sql.functions
-
Returns the current timestamp as a timestamp column.
- currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
-
- currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
-
- currentDatabase() - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns the current default database in this session.
- currentResult() - Method in interface org.apache.spark.partial.ApproximateEvaluator
-
- currentRow() - Static method in class org.apache.spark.sql.expressions.Window
-
Value representing the current row.
- currentRow() - Static method in class org.apache.spark.sql.functions
-
- currPrefLocs(Partition, RDD<?>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- customMetrics() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
-
- DAGSchedulerEvent - Interface in org.apache.spark.scheduler
-
Types of events that can be handled by the DAGScheduler.
- dapply(Dataset<Row>, byte[], byte[], Object[], StructType) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
The helper function for dapply() on R side.
- Data(Vector, double, Option<Object>) - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
-
- Data(double[], double[], double[][]) - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
-
- Data(double[], double[], double[][], String) - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
-
- Data(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data
-
- Data(Vector, double) - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
-
- data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
-
- data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- Data$() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data$
-
- Data$() - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data$
-
- Data$() - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data$
-
- Data$() - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data$
-
- Data$() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data$
-
- Database - Class in org.apache.spark.sql.catalog
-
A database in Spark, as returned by the listDatabases method defined in Catalog.
- Database(String, String, String) - Constructor for class org.apache.spark.sql.catalog.Database
-
- database() - Method in class org.apache.spark.sql.catalog.Function
-
- database() - Method in class org.apache.spark.sql.catalog.Table
-
- DATABASE_KEY - Static variable in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
The option key for database name.
- databaseExists(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Check if the database with the specified name exists.
- databaseExists(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Return whether a database with the specified name exists.
- databaseName() - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
Returns the value of the database name option.
- databaseTypeDefinition() - Method in class org.apache.spark.sql.jdbc.JdbcType
-
- dataDistribution() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- DATAFRAME_DAPPLY() - Static method in class org.apache.spark.api.r.RRunnerModes
-
- DATAFRAME_GAPPLY() - Static method in class org.apache.spark.api.r.RRunnerModes
-
- DataFrameNaFunctions - Class in org.apache.spark.sql
-
Functionality for working with missing data in DataFrames.
- DataFrameReader - Class in org.apache.spark.sql
-
Interface used to load a Dataset from external storage systems.
- DataFrameStatFunctions - Class in org.apache.spark.sql
-
Statistic functions for DataFrames.
- DataFrameWriter<T> - Class in org.apache.spark.sql
-
Interface used to write a Dataset to external storage systems.
- Dataset<T> - Class in org.apache.spark.sql
-
A Dataset is a strongly typed collection of domain-specific objects that can be transformed
in parallel using functional or relational operations.
- Dataset(SparkSession, LogicalPlan, Encoder<T>) - Constructor for class org.apache.spark.sql.Dataset
-
- Dataset(SQLContext, LogicalPlan, Encoder<T>) - Constructor for class org.apache.spark.sql.Dataset
-
- DatasetHolder<T> - Class in org.apache.spark.sql
-
A container for a Dataset, used for implicit conversions in Scala.
- DatasetUtils - Class in org.apache.spark.ml.util
-
- DatasetUtils() - Constructor for class org.apache.spark.ml.util.DatasetUtils
-
- dataSource() - Method in interface org.apache.spark.ui.PagedTable
-
- DataSourceOptions - Class in org.apache.spark.sql.sources.v2
-
An immutable string-to-string map in which keys are case-insensitive.
- DataSourceOptions(Map<String, String>) - Constructor for class org.apache.spark.sql.sources.v2.DataSourceOptions
-
- DataSourceReader - Interface in org.apache.spark.sql.sources.v2.reader
-
- DataSourceRegister - Interface in org.apache.spark.sql.sources
-
Data sources should implement this trait so that they can register an alias to their data source.
- DataSourceV2 - Interface in org.apache.spark.sql.sources.v2
-
The base interface for data source v2.
- DataSourceWriter - Interface in org.apache.spark.sql.sources.v2.writer
-
- DataStreamReader - Class in org.apache.spark.sql.streaming
-
Interface used to load a streaming Dataset from external storage systems.
- DataStreamWriter<T> - Class in org.apache.spark.sql.streaming
-
Interface used to write a streaming Dataset to external storage systems.
- dataTablesHeaderNodes(HttpServletRequest) - Static method in class org.apache.spark.ui.UIUtils
-
- dataType() - Method in class org.apache.spark.sql.catalog.Column
-
- dataType() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
- dataType() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
- DataType - Class in org.apache.spark.sql.types
-
The base type of all Spark SQL data types.
- DataType() - Constructor for class org.apache.spark.sql.types.DataType
-
- dataType() - Method in class org.apache.spark.sql.types.StructField
-
- dataType() - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the data type of this column vector.
- DataTypes - Class in org.apache.spark.sql.types
-
To get or create a specific data type, users should use the singleton objects and factory methods
provided by this class.
- DataTypes() - Constructor for class org.apache.spark.sql.types.DataTypes
-
- DataValidators - Class in org.apache.spark.mllib.util
-
Developer API
A collection of methods used to validate data before applying ML algorithms.
- DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
-
- DataWriter<T> - Interface in org.apache.spark.sql.sources.v2.writer
-
- DataWriterFactory<T> - Interface in org.apache.spark.sql.sources.v2.writer
-
- date() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type date.
- DATE() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable date type.
- date_add(Column, int) - Static method in class org.apache.spark.sql.functions
-
Returns the date that is the given number of days (days) after the start date (start).
- date_format(Column, String) - Static method in class org.apache.spark.sql.functions
-
Converts a date/timestamp/string to a string value in the date format given by the second argument.
- date_sub(Column, int) - Static method in class org.apache.spark.sql.functions
-
Returns the date that is the given number of days (days) before the start date (start).
- date_trunc(String, Column) - Static method in class org.apache.spark.sql.functions
-
Returns timestamp truncated to the unit specified by the format.
- datediff(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the number of days from start to end.
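The day arithmetic behind date_add, date_sub, and datediff can be illustrated with java.time on plain values. This is only a sketch of the semantics on single dates, not of the Column-based functions themselves; DateArithmeticSketch and its three helper methods are hypothetical names, with dateDiff taking the end date first on the assumption that it mirrors datediff(end, start).

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

final class DateArithmeticSketch {
    /** The date that is `days` days after `start`. */
    static LocalDate dateAdd(LocalDate start, int days) {
        return start.plusDays(days);
    }

    /** The date that is `days` days before `start`. */
    static LocalDate dateSub(LocalDate start, int days) {
        return start.minusDays(days);
    }

    /** The number of days from `start` to `end`; negative if `end` is earlier. */
    static long dateDiff(LocalDate end, LocalDate start) {
        return ChronoUnit.DAYS.between(start, end);
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2015, 7, 27);
        System.out.println(dateAdd(start, 4));                           // 2015-07-31
        System.out.println(dateSub(start, 4));                           // 2015-07-23
        System.out.println(dateDiff(LocalDate.of(2015, 7, 31), start));  // 4
    }
}
```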
- DateType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the DateType object.
- DateType - Class in org.apache.spark.sql.types
-
A date type, supporting "0001-01-01" through "9999-12-31".
- DateType() - Constructor for class org.apache.spark.sql.types.DateType
-
- dayofmonth(Column) - Static method in class org.apache.spark.sql.functions
-
Extracts the day of the month as an integer from a given date/timestamp/string.
- dayofweek(Column) - Static method in class org.apache.spark.sql.functions
-
Extracts the day of the week as an integer from a given date/timestamp/string.
- dayofyear(Column) - Static method in class org.apache.spark.sql.functions
-
Extracts the day of the year as an integer from a given date/timestamp/string.
- DB2Dialect - Class in org.apache.spark.sql.jdbc
-
- DB2Dialect() - Constructor for class org.apache.spark.sql.jdbc.DB2Dialect
-
- DCT - Class in org.apache.spark.ml.feature
-
A feature transformer that takes the 1D discrete cosine transform of a real vector.
- DCT(String) - Constructor for class org.apache.spark.ml.feature.DCT
-
- DCT() - Constructor for class org.apache.spark.ml.feature.DCT
-
- deallocate() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
-
- decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- decimal() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type decimal.
- decimal(int, int) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type decimal.
- DECIMAL() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable decimal type.
- Decimal - Class in org.apache.spark.sql.types
-
A mutable implementation of BigDecimal that can hold a Long if values are small enough.
- Decimal() - Constructor for class org.apache.spark.sql.types.Decimal
-
- Decimal.DecimalAsIfIntegral$ - Class in org.apache.spark.sql.types
-
An Integral evidence parameter for Decimals.
- Decimal.DecimalIsConflicted - Interface in org.apache.spark.sql.types
-
Common methods for Decimal evidence parameters
- Decimal.DecimalIsFractional$ - Class in org.apache.spark.sql.types
-
A Fractional evidence parameter for Decimals.
- DecimalAsIfIntegral$() - Constructor for class org.apache.spark.sql.types.Decimal.DecimalAsIfIntegral$
-
- DecimalIsFractional$() - Constructor for class org.apache.spark.sql.types.Decimal.DecimalIsFractional$
-
- DecimalType - Class in org.apache.spark.sql.types
-
The data type representing java.math.BigDecimal values.
- DecimalType(int, int) - Constructor for class org.apache.spark.sql.types.DecimalType
-
- DecimalType(int) - Constructor for class org.apache.spark.sql.types.DecimalType
-
- DecimalType() - Constructor for class org.apache.spark.sql.types.DecimalType
-
- DecimalType.Expression$ - Class in org.apache.spark.sql.types
-
- DecimalType.Fixed$ - Class in org.apache.spark.sql.types
-
- decimalTypeInfoToCatalyst(PrimitiveObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- DecisionTree - Class in org.apache.spark.mllib.tree
-
A class which implements a decision tree learning algorithm for classification and regression.
- DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
-
- DecisionTreeClassificationModel - Class in org.apache.spark.ml.classification
-
Decision tree model (http://en.wikipedia.org/wiki/Decision_tree_learning) for classification.
- DecisionTreeClassifier - Class in org.apache.spark.ml.classification
-
Decision tree learning algorithm (http://en.wikipedia.org/wiki/Decision_tree_learning)
for classification.
- DecisionTreeClassifier(String) - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- DecisionTreeClassifier() - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- DecisionTreeClassifierParams - Interface in org.apache.spark.ml.tree
-
- DecisionTreeModel - Interface in org.apache.spark.ml.tree
-
Abstraction for Decision Tree models.
- DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
-
Decision tree model for classification or regression.
- DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- DecisionTreeModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModel.SaveLoadV1_0$.NodeData - Class in org.apache.spark.mllib.tree.model
-
Model data for model import/export
- DecisionTreeModel.SaveLoadV1_0$.NodeData$ - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModel.SaveLoadV1_0$.PredictData - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModel.SaveLoadV1_0$.PredictData$ - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModel.SaveLoadV1_0$.SplitData - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModel.SaveLoadV1_0$.SplitData$ - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModelReadWrite - Class in org.apache.spark.ml.tree
-
Helper classes for tree model persistence
- DecisionTreeModelReadWrite() - Constructor for class org.apache.spark.ml.tree.DecisionTreeModelReadWrite
-
- DecisionTreeModelReadWrite.NodeData - Class in org.apache.spark.ml.tree
-
- DecisionTreeModelReadWrite.NodeData$ - Class in org.apache.spark.ml.tree
-
- DecisionTreeModelReadWrite.SplitData - Class in org.apache.spark.ml.tree
-
- DecisionTreeModelReadWrite.SplitData$ - Class in org.apache.spark.ml.tree
-
- DecisionTreeParams - Interface in org.apache.spark.ml.tree
-
Parameters for Decision Tree-based algorithms.
- DecisionTreeRegressionModel - Class in org.apache.spark.ml.regression
-
- DecisionTreeRegressor - Class in org.apache.spark.ml.regression
-
- DecisionTreeRegressor(String) - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- DecisionTreeRegressor() - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- DecisionTreeRegressorParams - Interface in org.apache.spark.ml.tree
-
- decode(Column, String) - Static method in class org.apache.spark.sql.functions
-
Decodes the first argument from binary into a string using the provided character set
(one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').
- decodeFileNameInURI(URI) - Static method in class org.apache.spark.util.Utils
-
Get the file name from uri's raw path and decode it.
- decodeStructField(StructField, boolean) - Method in interface org.apache.spark.ml.attribute.AttributeFactory
-
Creates an Attribute from a StructField instance, optionally preserving name.
- decodeURLParameter(String) - Static method in class org.apache.spark.ui.UIUtils
-
Decode URLParameter if URL is encoded by YARN-WebAppProxyServlet.
- DEFAULT_CONNECTION_TIMEOUT() - Static method in class org.apache.spark.api.r.SparkRDefaults
-
- DEFAULT_DRIVER_MEM_MB() - Static method in class org.apache.spark.util.Utils
-
Define a default value for driver memory here since this value is referenced across the code
base and nearly all files already use Utils.scala
- DEFAULT_HEARTBEAT_INTERVAL() - Static method in class org.apache.spark.api.r.SparkRDefaults
-
- DEFAULT_MAX_FAILURES() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- DEFAULT_MAX_TO_STRING_FIELDS() - Static method in class org.apache.spark.util.Utils
-
The performance overhead of creating and logging strings for wide schemas can be large.
- DEFAULT_NUM_RBACKEND_THREADS() - Static method in class org.apache.spark.api.r.SparkRDefaults
-
- DEFAULT_NUMBER_EXECUTORS() - Static method in class org.apache.spark.scheduler.cluster.SchedulerBackendUtils
-
- DEFAULT_ROLLING_INTERVAL_SECS() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- DEFAULT_SHUTDOWN_PRIORITY() - Static method in class org.apache.spark.util.ShutdownHookManager
-
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.BinaryAttribute
-
The default binary attribute.
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.NominalAttribute
-
The default nominal attribute.
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.NumericAttribute
-
The default numeric attribute.
- defaultCopy(ParamMap) - Method in interface org.apache.spark.ml.param.Params
-
Default implementation of copy with extra params.
- defaultCorrName() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
-
- DefaultCredentials - Class in org.apache.spark.streaming.kinesis
-
Returns DefaultAWSCredentialsProviderChain for authentication.
- DefaultCredentials() - Constructor for class org.apache.spark.streaming.kinesis.DefaultCredentials
-
- defaultLink() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
-
- defaultLink() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
-
- defaultLink() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
-
- defaultLink() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
-
- defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultMinPartitions() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user.
Note that math.min is used, so defaultMinPartitions cannot be higher than 2.
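A minimal sketch of the cap described in this entry, assuming the other argument to math.min is the default parallelism. The class and method below are hypothetical stand-ins, not Spark's API:

```java
// Hypothetical stand-in for the rule in the entry above; not Spark's own code.
class MinPartitionsSketch {
    // Assumption: the value is min(defaultParallelism, 2), so it never exceeds 2.
    static int defaultMinPartitions(int defaultParallelism) {
        return Math.min(defaultParallelism, 2);
    }
}
```

With this rule, a parallelism of 8 still yields 2, while a parallelism of 1 yields 1.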
- defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default level of parallelism to use when not given by user (e.g.
- defaultParallelism() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- defaultParallelism() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- defaultParallelism() - Method in class org.apache.spark.SparkContext
-
Default level of parallelism to use when not given by user (e.g.
- defaultParamMap() - Method in interface org.apache.spark.ml.param.Params
-
Internal param map for default values.
- defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Returns default configuration for the boosting algorithm
- defaultParams(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Returns default configuration for the boosting algorithm
- DefaultParamsReadable<T> - Interface in org.apache.spark.ml.util
-
Developer API
- DefaultParamsWritable - Interface in org.apache.spark.ml.util
-
Developer API
- DefaultPartitionCoalescer - Class in org.apache.spark.rdd
-
Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of
this RDD computes one or more of the parent ones.
- DefaultPartitionCoalescer(double) - Constructor for class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- DefaultPartitionCoalescer.PartitionLocations - Class in org.apache.spark.rdd
-
- defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
-
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
- defaultSize() - Method in class org.apache.spark.sql.types.ArrayType
-
The default size of a value of the ArrayType is the default size of the element type.
- defaultSize() - Method in class org.apache.spark.sql.types.BinaryType
-
The default size of a value of the BinaryType is 100 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.BooleanType
-
The default size of a value of the BooleanType is 1 byte.
- defaultSize() - Method in class org.apache.spark.sql.types.ByteType
-
The default size of a value of the ByteType is 1 byte.
- defaultSize() - Method in class org.apache.spark.sql.types.CalendarIntervalType
-
- defaultSize() - Method in class org.apache.spark.sql.types.DataType
-
The default size of a value of this data type, used internally for size estimation.
- defaultSize() - Method in class org.apache.spark.sql.types.DateType
-
The default size of a value of the DateType is 4 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.DecimalType
-
The default size of a value of the DecimalType is 8 bytes when precision is at most 18,
and 16 bytes otherwise.
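The size rule stated in this entry can be sketched as a single comparison. The class below is a hypothetical illustration that only restates the documented thresholds; it is not the DecimalType implementation:

```java
// Hypothetical illustration of the documented rule; not Spark's DecimalType code.
class DecimalSizeSketch {
    // 8 bytes when precision is at most 18, 16 bytes otherwise.
    static int defaultSize(int precision) {
        return precision <= 18 ? 8 : 16;
    }
}
```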
- defaultSize() - Method in class org.apache.spark.sql.types.DoubleType
-
The default size of a value of the DoubleType is 8 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.FloatType
-
The default size of a value of the FloatType is 4 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.HiveStringType
-
- defaultSize() - Method in class org.apache.spark.sql.types.IntegerType
-
The default size of a value of the IntegerType is 4 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.LongType
-
The default size of a value of the LongType is 8 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.MapType
-
The default size of a value of the MapType is
(the default size of the key type + the default size of the value type).
- defaultSize() - Method in class org.apache.spark.sql.types.NullType
-
- defaultSize() - Method in class org.apache.spark.sql.types.ObjectType
-
- defaultSize() - Method in class org.apache.spark.sql.types.ShortType
-
The default size of a value of the ShortType is 2 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.StringType
-
The default size of a value of the StringType is 20 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.StructType
-
The default size of a value of the StructType is the total default sizes of all field types.
- defaultSize() - Method in class org.apache.spark.sql.types.TimestampType
-
The default size of a value of the TimestampType is 8 bytes.
- defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- defaultStrategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- DefaultTopologyMapper - Class in org.apache.spark.storage
-
A TopologyMapper that assumes all nodes are in the same rack
- DefaultTopologyMapper(SparkConf) - Constructor for class org.apache.spark.storage.DefaultTopologyMapper
-
- defaultValue() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefault
-
- defaultValue() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
-
- defaultValue() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultString
-
- defaultValueString() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefault
-
- defaultValueString() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
-
- defaultValueString() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultString
-
- degree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-
The polynomial degree to expand, which should be greater than or equal to 1.
- degrees() - Method in class org.apache.spark.graphx.GraphOps
-
The degree of each vertex in the graph.
- degrees(Column) - Static method in class org.apache.spark.sql.functions
-
Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
- degrees(String) - Static method in class org.apache.spark.sql.functions
-
Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
- degreesOfFreedom() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
Degrees of freedom.
- degreesOfFreedom() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Degrees of freedom
- degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
-
- degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Returns the degree(s) of freedom of the hypothesis test.
- delegate() - Method in class org.apache.spark.InterruptibleIterator
-
- deleteCheckpointFiles() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
Developer API
- deleteExternalTmpPath(Configuration) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
-
- deleteRecursively(File) - Static method in class org.apache.spark.util.Utils
-
Delete a file or directory and its contents recursively.
- deleteWithJob(FileSystem, Path, boolean) - Method in class org.apache.spark.internal.io.FileCommitProtocol
-
Specifies that a file should be deleted with the commit of this job.
- delimiterOptions() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- delta() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Tweedie$
-
Constant used in initialization and deviance to avoid numerical issues.
- dense(int, int, double[]) - Static method in class org.apache.spark.ml.linalg.Matrices
-
Creates a column-major dense matrix.
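Column-major layout means the values array stores the first column, then the second, and so on. A small sketch of the resulting index arithmetic, using a plain double[] rather than Spark's classes (the helper name is hypothetical):

```java
// Hypothetical helper showing column-major indexing on a plain array.
class ColumnMajorSketch {
    // Entry (i, j) of a matrix with numRows rows lives at index i + numRows * j.
    static double get(double[] values, int numRows, int i, int j) {
        return values[i + numRows * j];
    }
}
```

For example, the 3 x 2 matrix with rows (1, 2), (3, 4), (5, 6) is stored as {1, 3, 5, 2, 4, 6}.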
- dense(double, double...) - Static method in class org.apache.spark.ml.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double, Seq<Object>) - Static method in class org.apache.spark.ml.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double[]) - Static method in class org.apache.spark.ml.linalg.Vectors
-
Creates a dense vector from a double array.
- dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a column-major dense matrix.
- dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from a double array.
- dense_rank() - Static method in class org.apache.spark.sql.functions
-
Window function: returns the rank of rows within a window partition, without any gaps.
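To illustrate what "without any gaps" means, here is a sketch that ranks an already sorted array of values so that ties share a rank and the next distinct value gets the next consecutive rank. It is a plain-Java illustration with a hypothetical helper name, not the Spark window function itself:

```java
// Hypothetical illustration of gap-free ranking over pre-sorted values.
class DenseRankSketch {
    // Input must already be sorted in the window's order.
    static int[] denseRank(int[] sorted) {
        int[] ranks = new int[sorted.length];
        int rank = 0;
        for (int i = 0; i < sorted.length; i++) {
            // Advance the rank only when the value changes, so ties leave no gap.
            if (i == 0 || sorted[i] != sorted[i - 1]) {
                rank++;
            }
            ranks[i] = rank;
        }
        return ranks;
    }
}
```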
- DenseMatrix - Class in org.apache.spark.ml.linalg
-
Column-major dense matrix.
- DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.ml.linalg.DenseMatrix
-
- DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.ml.linalg.DenseMatrix
-
Column-major dense matrix.
- DenseMatrix - Class in org.apache.spark.mllib.linalg
-
Column-major dense matrix.
- DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-
- DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-
Column-major dense matrix.
- DenseVector - Class in org.apache.spark.ml.linalg
-
A dense vector represented by a value array.
- DenseVector(double[]) - Constructor for class org.apache.spark.ml.linalg.DenseVector
-
- DenseVector - Class in org.apache.spark.mllib.linalg
-
A dense vector represented by a value array.
- DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
-
- dependencies() - Method in class org.apache.spark.rdd.RDD
-
Get the list of dependencies of this RDD, taking into account whether the
RDD is checkpointed or not.
- dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
-
List of parent DStreams on which this DStream depends
- dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- Dependency<T> - Class in org.apache.spark
-
Developer API
Base class for dependencies.
- Dependency() - Constructor for class org.apache.spark.Dependency
-
- DEPLOY_MODE - Static variable in class org.apache.spark.launcher.SparkLauncher
-
The Spark deploy mode.
- deployMode() - Method in class org.apache.spark.SparkContext
-
- depth() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel
-
Depth of the tree.
- depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Get depth of tree.
- depth() - Method in class org.apache.spark.util.sketch.CountMinSketch
-
- DerbyDialect - Class in org.apache.spark.sql.jdbc
-
- DerbyDialect() - Constructor for class org.apache.spark.sql.jdbc.DerbyDialect
-
- deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
-
- deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
-
- deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
-
- deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
-
- deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$
-
- deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$
-
- deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
-
- derivative() - Method in interface org.apache.spark.ml.ann.ActivationFunction
-
Implements a derivative of a function (needed for the back propagation)
- desc() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- desc() - Method in class org.apache.spark.sql.Column
-
Returns a sort expression based on the descending order of the column.
- desc(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on the descending order of the column.
- desc() - Method in class org.apache.spark.util.MethodIdentifier
-
- desc_nulls_first() - Method in class org.apache.spark.sql.Column
-
Returns a sort expression based on the descending order of the column,
and null values appear before non-null values.
- desc_nulls_first(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on the descending order of the column,
and null values appear before non-null values.
- desc_nulls_last() - Method in class org.apache.spark.sql.Column
-
Returns a sort expression based on the descending order of the column,
and null values appear after non-null values.
- desc_nulls_last(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on the descending order of the column,
and null values appear after non-null values.
- describe(String...) - Method in class org.apache.spark.sql.Dataset
-
Computes basic statistics for numeric and string columns, including count, mean, stddev, min,
and max.
- describe(Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Computes basic statistics for numeric and string columns, including count, mean, stddev, min,
and max.
- describeTopics(int) - Method in class org.apache.spark.ml.clustering.LDAModel
-
Return the topics described by their top-weighted terms.
- describeTopics() - Method in class org.apache.spark.ml.clustering.LDAModel
-
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Return the topics described by weighted terms.
- describeTopics() - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Return the topics described by weighted terms.
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- description() - Method in class org.apache.spark.ExceptionFailure
-
- description() - Method in class org.apache.spark.sql.catalog.Column
-
- description() - Method in class org.apache.spark.sql.catalog.Database
-
- description() - Method in class org.apache.spark.sql.catalog.Function
-
- description() - Method in class org.apache.spark.sql.catalog.Table
-
- description() - Method in class org.apache.spark.sql.streaming.SinkProgress
-
- description() - Method in class org.apache.spark.sql.streaming.SourceProgress
-
- description() - Method in class org.apache.spark.status.api.v1.JobData
-
- description() - Method in class org.apache.spark.status.api.v1.StageData
-
- description() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
-
- description() - Method in class org.apache.spark.status.LiveStage
-
- description() - Method in class org.apache.spark.storage.StorageLevel
-
- description() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-
- DESER_CPU_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- DESER_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- DeserializationStream - Class in org.apache.spark.serializer
-
Developer API
A stream for reading serialized objects.
- DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
-
- deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(byte[]) - Static method in class org.apache.spark.util.Utils
-
Deserialize an object using Java serialization
- deserialize(byte[], ClassLoader) - Static method in class org.apache.spark.util.Utils
-
Deserialize an object using Java serialization and the given ClassLoader
- deserialized() - Method in class org.apache.spark.storage.StorageLevel
-
- DeserializedMemoryEntry<T> - Class in org.apache.spark.storage.memory
-
- DeserializedMemoryEntry(Object, long, ClassTag<T>) - Constructor for class org.apache.spark.storage.memory.DeserializedMemoryEntry
-
- DeserializedValuesHolder<T> - Class in org.apache.spark.storage.memory
-
A holder for storing the deserialized values.
- DeserializedValuesHolder(ClassTag<T>) - Constructor for class org.apache.spark.storage.memory.DeserializedValuesHolder
-
- deserializeLongValue(byte[]) - Static method in class org.apache.spark.util.Utils
-
Deserialize a Long value (used for PythonPartitioner)
- deserializeOffset(String) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader
-
Deserialize a JSON string into an Offset of the implementation-defined offset type.
- deserializeOffset(String) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader
-
Deserialize a JSON string into an Offset of the implementation-defined offset type.
- DeserializerLock - Class in org.apache.spark.sql.hive
-
Object to synchronize on when calling org.apache.hadoop.hive.serde2.Deserializer#initialize.
- DeserializerLock() - Constructor for class org.apache.spark.sql.hive.DeserializerLock
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.DummySerializerInstance
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserializeViaNestedStream(InputStream, SerializerInstance, Function1<DeserializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Deserialize via nested stream using specific serializer
- destroy() - Method in class org.apache.spark.broadcast.Broadcast
-
Destroy all data and metadata related to this broadcast variable.
- details() - Method in class org.apache.spark.scheduler.StageInfo
-
- details() - Method in class org.apache.spark.status.api.v1.StageData
-
- DETERMINATE() - Static method in class org.apache.spark.rdd.DeterministicLevel
-
- determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Determines the bounds for range partitioning from candidates with weights indicating how many
items each represents.
- DetermineTableStats - Class in org.apache.spark.sql.hive
-
- DetermineTableStats(SparkSession) - Constructor for class org.apache.spark.sql.hive.DetermineTableStats
-
- deterministic() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
Returns true iff this function is deterministic, i.e.
- deterministic() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
Returns true iff the UDF is deterministic, i.e.
- DeterministicLevel - Class in org.apache.spark.rdd
-
The deterministic level of RDD's output (i.e.
- DeterministicLevel() - Constructor for class org.apache.spark.rdd.DeterministicLevel
-
- deviance(double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
-
- deviance(double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
-
- deviance(double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
-
- deviance(double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
-
- deviance() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
The deviance for the fitted model.
- devianceResiduals() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
The weighted residuals, the usual residuals rescaled by
the square root of the instance weights.
- dfToCols(Dataset<Row>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- dfToRowRDD(Dataset<Row>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- dgemm(double, DenseMatrix<Object>, DenseMatrix<Object>, double, DenseMatrix<Object>) - Static method in class org.apache.spark.ml.ann.BreezeUtil
-
DGEMM: C := alpha * A * B + beta * C
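The update C := alpha * A * B + beta * C can be written out as a triple loop. The sketch below assumes column-major double[] storage and takes the dimensions explicitly; it is a naive reference illustration with hypothetical names, not BreezeUtil and not a BLAS binding:

```java
// Naive reference sketch of the DGEMM update on column-major arrays.
class DgemmSketch {
    // A is m x k, B is k x n, C is m x n; C is updated in place.
    static void dgemm(double alpha, double[] a, double[] b, double beta,
                      double[] c, int m, int k, int n) {
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < m; i++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    // A(i, p) * B(p, j) in column-major indexing.
                    sum += a[i + m * p] * b[p + k * j];
                }
                c[i + m * j] = alpha * sum + beta * c[i + m * j];
            }
        }
    }
}
```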
- dgemv(double, DenseMatrix<Object>, DenseVector<Object>, double, DenseVector<Object>) - Static method in class org.apache.spark.ml.ann.BreezeUtil
-
DGEMV: y := alpha * A * x + beta * y
- diag(Vector) - Static method in class org.apache.spark.ml.linalg.DenseMatrix
-
Generate a diagonal matrix in DenseMatrix format from the supplied values.
- diag(Vector) - Static method in class org.apache.spark.ml.linalg.Matrices
-
Generate a diagonal matrix in Matrix format from the supplied values.
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a diagonal matrix in DenseMatrix format from the supplied values.
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a diagonal matrix in Matrix format from the supplied values.
- diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD
-
For each vertex present in both this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other.
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
-
For each vertex present in both this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other.
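The diff semantics described in these two entries can be sketched with ordinary maps keyed by vertex id. This is an illustration on java.util.Map with a hypothetical class name, not the VertexRDD implementation:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical map-based illustration of the documented diff semantics.
class VertexDiffSketch {
    // Keeps vertices present in both maps whose values differ, taking the value from other.
    static <V> Map<Long, V> diff(Map<Long, V> self, Map<Long, V> other) {
        Map<Long, V> result = new HashMap<>();
        for (Map.Entry<Long, V> e : other.entrySet()) {
            Long id = e.getKey();
            if (self.containsKey(id) && !Objects.equals(self.get(id), e.getValue())) {
                result.put(id, e.getValue());
            }
        }
        return result;
    }
}
```

Vertices that appear in only one of the two inputs, and vertices whose values match, are dropped from the result.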
- DifferentiableLossAggregator<Datum,Agg extends DifferentiableLossAggregator<Datum,Agg>> - Interface in org.apache.spark.ml.optim.aggregator
-
A parent trait for aggregators used in fitting MLlib models.
- DifferentiableRegularization<T> - Interface in org.apache.spark.ml.optim.loss
-
A Breeze diff function which represents a cost function for differentiable regularization
of parameters.
- dim() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
-
The dimension of the gradient array.
- dir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- directory(File) - Method in class org.apache.spark.launcher.SparkLauncher
-
Sets the working directory of spark-submit.
- disableOutputSpecValidation() - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
-
Allows for the spark.hadoop.validateOutputSpecs checks to be disabled on a case-by-case
basis; see SPARK-4835 for more details.
- disconnect() - Method in interface org.apache.spark.launcher.SparkAppHandle
-
Disconnects the handle from the application, without stopping it.
- DISK_BYTES_SPILLED() - Static method in class org.apache.spark.InternalAccumulator
-
- DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- DISK_SPILL() - Static method in class org.apache.spark.status.TaskIndexNames
-
- DiskBlockData - Class in org.apache.spark.storage
-
- DiskBlockData(long, long, File, long) - Constructor for class org.apache.spark.storage.DiskBlockData
-
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
-
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- diskSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- diskSize() - Method in class org.apache.spark.storage.BlockStatus
-
- diskSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
-
- diskSize() - Method in class org.apache.spark.storage.RDDInfo
-
- diskUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- diskUsed() - Method in class org.apache.spark.status.LiveExecutor
-
- diskUsed() - Method in class org.apache.spark.status.LiveRDD
-
- diskUsed() - Method in class org.apache.spark.status.LiveRDDDistribution
-
- diskUsed() - Method in class org.apache.spark.status.LiveRDDPartition
-
- dispersion() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
The dispersion of the fitted model.
- dispose() - Method in interface org.apache.spark.storage.BlockData
-
- dispose() - Method in class org.apache.spark.storage.DiskBlockData
-
- dispose(ByteBuffer) - Static method in class org.apache.spark.storage.StorageUtils
-
Attempt to clean up a ByteBuffer if it is direct or memory-mapped.
- distanceMeasure() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
Param for the distance measure to be used in evaluation
(supports "squaredEuclidean" (default), "cosine").
- distanceMeasure() - Method in interface org.apache.spark.ml.param.shared.HasDistanceMeasure
-
Param for the distance measure.
- distanceMeasure() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
- distanceMeasure() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset that contains only the unique rows from this Dataset.
- distinct(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
Creates a Column for this UDAF using the distinct values of the given Columns as input arguments.
- distinct(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
Creates a Column for this UDAF using the distinct values of the given Columns as input arguments.
- DistributedLDAModel - Class in org.apache.spark.ml.clustering
-
Distributed model fitted by LDA.
- DistributedLDAModel - Class in org.apache.spark.mllib.clustering
-
Distributed LDA model.
- DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
-
Represents a distributively stored matrix backed by one or more RDDs.
- Distribution - Interface in org.apache.spark.sql.sources.v2.reader.partitioning
-
An interface to represent a data distribution requirement, which specifies how the records should
be distributed among the data partitions (one InputPartitionReader outputs data for one partition).
- distribution(LiveExecutor) - Method in class org.apache.spark.status.LiveRDD
-
- distributionOpt(LiveExecutor) - Method in class org.apache.spark.status.LiveRDD
-
- div(Decimal, Decimal) - Method in class org.apache.spark.sql.types.Decimal.DecimalIsFractional$
-
- div(Duration) - Method in class org.apache.spark.streaming.Duration
-
- divide(Object) - Method in class org.apache.spark.sql.Column
-
Divides this expression by another expression.
- doc() - Method in class org.apache.spark.ml.param.Param
-
- docConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- docConcentration() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- docConcentration() - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- docConcentration() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- DocumentFrequencyAggregator(int) - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
- DocumentFrequencyAggregator() - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
- doesDirectoryContainAnyNewFiles(File, long) - Static method in class org.apache.spark.util.Utils
-
Determines if a directory contains any files newer than cutoff seconds.
- doFetchFile(String, File, String, SparkConf, org.apache.spark.SecurityManager, Configuration) - Static method in class org.apache.spark.util.Utils
-
Download a file or directory to target directory.
- doPostEvent(SparkListenerInterface, SparkListenerEvent) - Method in interface org.apache.spark.scheduler.SparkListenerBus
-
- doPostEvent(L, E) - Method in interface org.apache.spark.util.ListenerBus
-
Post an event to the specified listener.
- Dot - Class in org.apache.spark.ml.feature
-
- Dot() - Constructor for class org.apache.spark.ml.feature.Dot
-
- dot(Vector, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS
-
dot(x, y)
- dot(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
dot(x, y)
- doTest(DStream<Tuple2<StatCounter, StatCounter>>) - Method in interface org.apache.spark.mllib.stat.test.StreamingTestMethod
-
Perform streaming 2-sample statistical significance testing.
- doTest(DStream<Tuple2<StatCounter, StatCounter>>) - Static method in class org.apache.spark.mllib.stat.test.StudentTTest
-
- doTest(DStream<Tuple2<StatCounter, StatCounter>>) - Static method in class org.apache.spark.mllib.stat.test.WelchTTest
-
- DOUBLE() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable double type.
- doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- doubleAccumulator() - Method in class org.apache.spark.SparkContext
-
Create and register a double accumulator, which starts with 0 and accumulates inputs by add.
- doubleAccumulator(String) - Method in class org.apache.spark.SparkContext
-
Create and register a double accumulator, which starts with 0 and accumulates inputs by add.
- DoubleAccumulator - Class in org.apache.spark.util
-
An accumulator for computing sum, count, and averages for double precision floating numbers.
- DoubleAccumulator() - Constructor for class org.apache.spark.util.DoubleAccumulator
-
- DoubleAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
Deprecated.
- DoubleArrayArrayParam - Class in org.apache.spark.ml.param
-
Developer API
Specialized version of Param[Array[Array[Double]]] for Java.
- DoubleArrayArrayParam(Params, String, String, Function1<double[][], Object>) - Constructor for class org.apache.spark.ml.param.DoubleArrayArrayParam
-
- DoubleArrayArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleArrayArrayParam
-
- DoubleArrayParam - Class in org.apache.spark.ml.param
-
Developer API
Specialized version of Param[Array[Double]] for Java.
- DoubleArrayParam(Params, String, String, Function1<double[], Object>) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
-
- DoubleArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
-
- DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more records of type Double from each input record.
- DoubleFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns Doubles, and can be used to construct DoubleRDDs.
- DoubleParam - Class in org.apache.spark.ml.param
-
Developer API
Specialized version of Param[Double] for Java.
- DoubleParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleParam(String, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleRDDFunctions - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of Doubles through an implicit conversion.
- DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
-
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.rdd.RDD
-
- DoubleType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the DoubleType object.
- DoubleType - Class in org.apache.spark.sql.types
-
The data type representing Double values.
- DoubleType() - Constructor for class org.apache.spark.sql.types.DoubleType
-
- driver() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SetupDriver
-
- DRIVER_EXTRA_CLASSPATH - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Configuration key for the driver class path.
- DRIVER_EXTRA_JAVA_OPTIONS - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Configuration key for the driver VM options.
- DRIVER_EXTRA_LIBRARY_PATH - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Configuration key for the driver native library path.
- DRIVER_MEMORY - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Configuration key for the driver memory.
- DRIVER_WAL_BATCHING_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- DRIVER_WAL_BATCHING_TIMEOUT_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- DRIVER_WAL_CLASS_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- DRIVER_WAL_CLOSE_AFTER_WRITE_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- DRIVER_WAL_MAX_FAILURES_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- DRIVER_WAL_ROLLING_INTERVAL_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- driverLogs() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- drop() - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that drops rows containing any null or NaN values.
- drop(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that drops rows containing null or NaN values.
- drop(String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
- drop(Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new DataFrame that drops rows containing any null or NaN values
in the specified columns.
- drop(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
- drop(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new DataFrame that drops rows containing null or NaN values
in the specified columns.
- drop(int) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values.
- drop(int, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values
in the specified columns.
- drop(int, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new DataFrame that drops rows containing fewer than minNonNulls
non-null and non-NaN values in the specified columns.
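The minNonNulls rule used by the drop(int, ...) overloads above can be illustrated with a small standalone sketch. It does not call Spark: the class MinNonNullsSketch and its methods are hypothetical names, and rows are modelled as plain Double[] arrays purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Spark API.
final class MinNonNullsSketch {
    // Counts the values in a row that are neither null nor NaN.
    static int countValid(Double[] row) {
        int valid = 0;
        for (Double v : row) {
            if (v != null && !v.isNaN()) {
                valid++;
            }
        }
        return valid;
    }

    // Keeps only the rows that have at least minNonNulls valid values.
    static List<Double[]> dropSparseRows(List<Double[]> rows, int minNonNulls) {
        List<Double[]> kept = new ArrayList<>();
        for (Double[] row : rows) {
            if (countValid(row) >= minNonNulls) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Double[]> rows = new ArrayList<>();
        rows.add(new Double[] {1.0, 2.0, 3.0});        // 3 valid values
        rows.add(new Double[] {1.0, null, Double.NaN}); // 1 valid value
        rows.add(new Double[] {null, 5.0, 6.0});        // 2 valid values
        System.out.println(dropSparseRows(rows, 2).size()); // prints 2
    }
}
```

With minNonNulls set to 2, the second row is the only one removed, because it holds a single value that is neither null nor NaN.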
- drop(String...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with columns dropped.
- drop(String) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with a column dropped.
- drop(Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with columns dropped.
- drop(Column) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with a column dropped.
- dropDatabase(String, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Drop the specified database, if it exists.
- dropDuplicates(String, String...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with duplicate rows removed, considering only
the subset of columns.
- dropDuplicates() - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset that contains only the unique rows from this Dataset.
- dropDuplicates(Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
(Scala-specific) Returns a new Dataset with duplicate rows removed, considering only
the subset of columns.
- dropDuplicates(String[]) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with duplicate rows removed, considering only
the subset of columns.
- dropDuplicates(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with duplicate rows removed, considering only
the subset of columns.
- dropFromMemory(BlockId, Function0<Either<Object, org.apache.spark.util.io.ChunkedByteBuffer>>, ClassTag<T>) - Method in interface org.apache.spark.storage.memory.BlockEvictionHandler
-
Drop a block from memory, possibly putting it on disk if applicable.
- dropFunction(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Drop an existing function in the database.
- dropGlobalTempView(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Drops the global temporary view with the given view name in the catalog.
- dropLast() - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
Whether to drop the last category in the encoded vector (default: true)
- dropLast() - Method in interface org.apache.spark.ml.feature.OneHotEncoderBase
-
Whether to drop the last category in the encoded vector (default: true)
- dropPartitions(String, String, Seq<Map<String, String>>, boolean, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Drop one or many partitions in the given table, assuming they exist.
- dropTable(String, String, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Drop the specified table.
- dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
-
- dropTempView(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Drops the local temporary view with the given view name in the catalog.
- dspmv(int, double, DenseVector, DenseVector, double, DenseVector) - Static method in class org.apache.spark.ml.linalg.BLAS
-
y := alpha*A*x + beta*y
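The update y := alpha*A*x + beta*y can be illustrated with a standalone sketch in plain Java. It does not call Spark: the class PackedSymvSketch and its symv method are hypothetical names, and the sketch assumes that A is a symmetric n-by-n matrix whose upper triangle is packed column by column, as in the conventional BLAS packed layout.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the Spark API.
final class PackedSymvSketch {
    // Computes y := alpha*A*x + beta*y in place.
    // Assumed layout: ap holds the upper triangle of A column by column,
    // so A(i, j) with i <= j is stored at ap[i + j*(j+1)/2].
    static void symv(int n, double alpha, double[] ap, double[] x, double beta, double[] y) {
        double[] ax = new double[n]; // accumulates A*x
        for (int j = 0; j < n; j++) {
            for (int i = 0; i <= j; i++) {
                double a = ap[i + j * (j + 1) / 2];
                ax[i] += a * x[j];
                if (i != j) {
                    ax[j] += a * x[i]; // mirrored lower-triangle entry A(j, i)
                }
            }
        }
        for (int i = 0; i < n; i++) {
            y[i] = alpha * ax[i] + beta * y[i];
        }
    }

    public static void main(String[] args) {
        // A = [[2, 1], [1, 3]] packed as {A(0,0), A(0,1), A(1,1)}
        double[] ap = {2.0, 1.0, 3.0};
        double[] x = {1.0, 2.0};
        double[] y = {10.0, 20.0};
        symv(2, 1.0, ap, x, 0.5, y);
        System.out.println(Arrays.toString(y)); // prints [9.0, 17.0]
    }
}
```

Packed storage keeps only n*(n+1)/2 entries, so each off-diagonal value is read once and applied to both of the positions it represents.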
- Dst - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose the destination and edge fields but not the source field.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex attribute of the edge's destination vertex.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
-
The destination vertex attribute
- dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstCol() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
-
Name of the input column for destination vertex IDs.
- dstId() - Method in class org.apache.spark.graphx.Edge
-
- dstId() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex id of the edge's destination vertex.
- dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- DStream<T> - Class in org.apache.spark.streaming.dstream
-
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous
sequence of RDDs (of the same type) representing a continuous stream of data (see
org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
- DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
-
- dtypes() - Method in class org.apache.spark.sql.Dataset
-
Returns all column names and their data types as an array.
- DummySerializerInstance - Class in org.apache.spark.serializer
-
Unfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter.
- duration() - Method in class org.apache.spark.scheduler.TaskInfo
-
- duration() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- duration() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
-
- duration() - Method in class org.apache.spark.status.api.v1.TaskData
-
- DURATION() - Static method in class org.apache.spark.status.TaskIndexNames
-
- Duration - Class in org.apache.spark.streaming
-
- Duration(long) - Constructor for class org.apache.spark.streaming.Duration
-
- duration() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-
Return the duration of this output operation.
- durationMs() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
- Durations - Class in org.apache.spark.streaming
-
- Durations() - Constructor for class org.apache.spark.streaming.Durations
-
- f() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
- f1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based f1-measure averaged by the number of documents
- f1Measure(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns f1-measure for a given label (category)
- factorial(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the factorial of the given value.
- failed() - Method in class org.apache.spark.scheduler.TaskInfo
-
- FAILED() - Static method in class org.apache.spark.TaskState
-
- failedStages() - Method in class org.apache.spark.status.LiveJob
-
- failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- failedTasks() - Method in class org.apache.spark.status.LiveExecutor
-
- failedTasks() - Method in class org.apache.spark.status.LiveExecutorStageSummary
-
- failedTasks() - Method in class org.apache.spark.status.LiveJob
-
- failedTasks() - Method in class org.apache.spark.status.LiveStage
-
- failure(String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- failureReason() - Method in class org.apache.spark.scheduler.StageInfo
-
If the stage failed, the reason why.
- failureReason() - Method in class org.apache.spark.status.api.v1.StageData
-
- failureReason() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
-
- failureReason() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-
- failureReasonCell(String, int, boolean) - Static method in class org.apache.spark.streaming.ui.UIUtils
-
- FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- FAKE_HIVE_VERSION() - Static method in class org.apache.spark.sql.hive.HiveUtils
-
- FalsePositiveRate - Class in org.apache.spark.mllib.evaluation.binary
-
False positive rate.
- FalsePositiveRate() - Constructor for class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
-
- falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns false positive rate for a given label (category)
- falsePositiveRateByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Returns false positive rate for each label (category).
- family() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
Param for the name of family which is a description of the label distribution
to be used in the model.
- family() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
Param for the name of family which is a description of the error distribution
to be used in the model.
- Family$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Family$
-
- FamilyAndLink$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.FamilyAndLink$
-
- fdr() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
The upper bound of the expected false discovery rate.
- fdr() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- feature() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data
-
- feature() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- feature() - Method in class org.apache.spark.mllib.tree.model.Split
-
- FeatureHasher - Class in org.apache.spark.ml.feature
-
Feature hashing projects a set of categorical or numerical features into a feature vector of
specified dimension (typically substantially smaller than that of the original feature
space).
- FeatureHasher(String) - Constructor for class org.apache.spark.ml.feature.FeatureHasher
-
- FeatureHasher() - Constructor for class org.apache.spark.ml.feature.FeatureHasher
-
- featureImportances() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
Estimate of the importance of each feature.
- featureImportances() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
Estimate of the importance of each feature.
- featureImportances() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
Estimate of the importance of each feature.
- featureImportances() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
Estimate of the importance of each feature.
- featureImportances() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
Estimate of the importance of each feature.
- featureImportances() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
Estimate of the importance of each feature.
- featureIndex() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase
-
Param for the index of the feature if featuresCol is a vector column (default: 0), no
effect otherwise.
- featureIndex() - Method in class org.apache.spark.ml.tree.CategoricalSplit
-
- featureIndex() - Method in class org.apache.spark.ml.tree.ContinuousSplit
-
- featureIndex() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
-
- featureIndex() - Method in interface org.apache.spark.ml.tree.Split
-
Index of feature which this split tests
- features() - Method in class org.apache.spark.ml.feature.LabeledPoint
-
- features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- featuresCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Field in "predictions" which gives the features of each instance as a vector.
- featuresCol() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
-
- featuresCol() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
-
- featuresCol() - Method in interface org.apache.spark.ml.param.shared.HasFeaturesCol
-
Param for features column name.
- featuresCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
- featureSubsetStrategy() - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
-
The number of features to consider for splits at each tree node.
- featureSum() - Method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats
-
- FeatureType - Class in org.apache.spark.mllib.tree.configuration
-
Enum to describe whether a feature is "continuous" or "categorical"
- FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
-
- featureType() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- featureType() - Method in class org.apache.spark.mllib.tree.model.Split
-
- FETCH_WAIT_TIME() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
-
- FetchFailed - Class in org.apache.spark
-
Developer API
Task failed to fetch shuffle data from a remote node.
- FetchFailed(BlockManagerId, int, int, int, String) - Constructor for class org.apache.spark.FetchFailed
-
- fetchFile(String, File, SparkConf, org.apache.spark.SecurityManager, Configuration, long, boolean) - Static method in class org.apache.spark.util.Utils
-
Download a file or directory to target directory.
- fetchPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
-
- fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- field() - Method in class org.apache.spark.storage.BroadcastBlockId
-
- fieldIndex(String) - Method in interface org.apache.spark.sql.Row
-
Returns the index of a given field name.
- fieldIndex(String) - Method in class org.apache.spark.sql.types.StructType
-
Returns the index of a given field.
- fieldNames() - Method in class org.apache.spark.sql.types.StructType
-
Returns all field names in an array.
- fields() - Method in class org.apache.spark.sql.types.StructType
-
- FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- FILE_FORMAT() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- FileBasedTopologyMapper - Class in org.apache.spark.storage
-
A simple file based topology mapper.
- FileBasedTopologyMapper(SparkConf) - Constructor for class org.apache.spark.storage.FileBasedTopologyMapper
-
- FileCommitProtocol - Class in org.apache.spark.internal.io
-
An interface to define how a single Spark job commits its outputs.
- FileCommitProtocol() - Constructor for class org.apache.spark.internal.io.FileCommitProtocol
-
- FileCommitProtocol.EmptyTaskCommitMessage$ - Class in org.apache.spark.internal.io
-
- FileCommitProtocol.TaskCommitMessage - Class in org.apache.spark.internal.io
-
- fileFormat() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- files() - Method in class org.apache.spark.SparkContext
-
- fileStream(String, Class<K>, Class<V>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Function1<Path, Object>, boolean, Configuration, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fill(long) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
- fill(double) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
- fill(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null values in string columns with value.
- fill(long, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
- fill(double, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
- fill(long, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified
numeric columns.
- fill(double, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified
numeric columns.
- fill(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null values in specified string columns.
- fill(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new DataFrame that replaces null values in
specified string columns.
- fill(boolean) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null values in boolean columns with value.
- fill(boolean, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new DataFrame that replaces null values in specified
boolean columns.
- fill(boolean, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null values in specified boolean columns.
- fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null values.
- fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new DataFrame that replaces null values.
- filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function1<Graph<VD, ED>, Graph<VD2, ED2>>, Function1<EdgeTriplet<VD2, ED2>, Object>, Function2<Object, VD2, Object>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.GraphOps
-
Filter the graph by computing some values to filter on, and applying the predicates.
- filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- filter(Function1<Tuple2<Object, VD>, Object>) - Method in class org.apache.spark.graphx.VertexRDD
-
Restricts the vertex set to the set of vertices satisfying the given predicate.
- filter(Params) - Method in class org.apache.spark.ml.param.ParamMap
-
Filters this param map for the given parent.
- filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Column) - Method in class org.apache.spark.sql.Dataset
-
Filters rows using the given condition.
- filter(String) - Method in class org.apache.spark.sql.Dataset
-
Filters rows using the given SQL expression.
- filter(Function1<T, Object>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Scala-specific) Returns a new Dataset that only contains elements where func returns true.
- filter(FilterFunction<T>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Java-specific) Returns a new Dataset that only contains elements where func returns true.
- Filter - Class in org.apache.spark.sql.sources
-
A filter predicate for data sources.
- Filter() - Constructor for class org.apache.spark.sql.sources.Filter
-
- filter() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
-
- filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filterByRange(K, K) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Returns an RDD containing only the elements in the inclusive range lower to upper.
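The inclusive-range behaviour described for filterByRange can be modelled without Spark. The following is a minimal plain-Java sketch; the class and method names are hypothetical and operate on an in-memory list of keys, not on an RDD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Spark API.
final class FilterByRangeSketch {
    // Keeps every key k with lower <= k <= upper; both bounds are included.
    static <K extends Comparable<K>> List<K> filterByRange(List<K> keys, K lower, K upper) {
        List<K> kept = new ArrayList<>();
        for (K key : keys) {
            if (key.compareTo(lower) >= 0 && key.compareTo(upper) <= 0) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(filterByRange(keys, 2, 4));
    }
}
```

Because both bounds are included, the call in main keeps 2, 3 and 4.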
- FilterFunction<T> - Interface in org.apache.spark.api.java.function
-
Base interface for a function used in Dataset's filter function.
- filterName() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- filterParams() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- finalStorageLevel() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for StorageLevel for ALS model factors.
- findAccessedFields(SerializedLambda, ClassLoader, Map<Class<?>, Set<String>>, boolean) - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
Scans an indylambda Scala closure, along with its lexically nested closures, and populates
the accessed fields info with the fields on the outer object that are accessed.
- findClass(String) - Method in class org.apache.spark.util.ParentClassLoader
-
- findFrequentSequentialPatterns(Dataset<?>) - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
Experimental
Finds the complete set of frequent sequential patterns in the input sequences of itemsets.
- findListenersByClass(ClassTag<T>) - Method in interface org.apache.spark.util.ListenerBus
-
- findMissingPartitions() - Method in class org.apache.spark.ShuffleStatus
-
Returns the sequence of partition ids that are missing (i.e.
- findSynonyms(String, int) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
Find "num" number of words closest in similarity to the given word, not
including the word itself.
- findSynonyms(Vector, int) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
Find "num" number of words whose vector representation is most similar to the supplied vector.
- findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Find synonyms of a word; do not include the word itself in results.
- findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Find synonyms of the vector representation of a word, possibly
including any words in the model vocabulary whose vector representation
is the supplied vector.
- findSynonymsArray(Vector, int) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
Find "num" number of words whose vector representation is most similar to the supplied vector.
- findSynonymsArray(String, int) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
Find "num" number of words closest in similarity to the given word, not
including the word itself.
- finish(BUF) - Method in class org.apache.spark.sql.expressions.Aggregator
-
Transform the output of the reduction.
- finished() - Method in class org.apache.spark.scheduler.TaskInfo
-
- FINISHED() - Static method in class org.apache.spark.TaskState
-
- finishTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
The time when the task has completed successfully (including the time to remotely fetch
results, if necessary).
- first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- first() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- first() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the first element in this RDD.
- first() - Method in class org.apache.spark.rdd.RDD
-
Return the first element in this RDD.
- first() - Method in class org.apache.spark.sql.Dataset
-
Returns the first row.
- first(Column, boolean) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the first value in a group.
- first(String, boolean) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the first value of a column in a group.
- first(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the first value in a group.
- first(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the first value of a column in a group.
- firstFailureReason() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
-
- firstLaunchTime() - Method in class org.apache.spark.status.LiveStage
-
- firstTaskLaunchedTime() - Method in class org.apache.spark.status.api.v1.StageData
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.clustering.KMeans
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.clustering.LDA
-
- fit(Dataset<?>, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with optional parameters.
- fit(Dataset<?>, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with optional parameters.
- fit(Dataset<?>, ParamMap) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with provided parameter map.
- fit(Dataset<?>) - Method in class org.apache.spark.ml.Estimator
-
Fits a model to the input data.
- fit(Dataset<?>, ParamMap[]) - Method in class org.apache.spark.ml.Estimator
-
Fits multiple models to the input data with multiple sets of parameters.
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.IDF
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.Imputer
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.PCA
-
Computes a PCAModel that contains the principal components of the input vectors.
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.RFormula
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.fpm.FPGrowth
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.Pipeline
-
Fits the pipeline to the input dataset with additional parameters.
- fit(Dataset<?>) - Method in class org.apache.spark.ml.Predictor
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.recommendation.ALS
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- fit(Dataset<?>) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- fit(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
Returns a ChiSquared feature selector.
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
-
Computes the inverse document frequency.
- fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
-
Computes the inverse document frequency.
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA
-
Computes a PCAModel that contains the principal components of the input vectors.
- fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA
-
Java-friendly version of fit().
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler
-
Computes the mean and variance and stores as a model to be used for later scaling.
- fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Computes the vector representation of each word in vocabulary.
- fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Computes the vector representation of each word in vocabulary (Java version).
- fitIntercept() - Method in interface org.apache.spark.ml.param.shared.HasFitIntercept
-
Param for whether to fit an intercept term.
- Fixed$() - Constructor for class org.apache.spark.sql.types.DecimalType.Fixed$
-
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMap(Function1<T, TraversableOnce<U>>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Scala-specific)
Returns a new Dataset by first applying a function to all elements of this Dataset,
and then flattening the results.
- flatMap(FlatMapFunction<T, U>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Java-specific)
Returns a new Dataset by first applying a function to all elements of this Dataset,
and then flattening the results.
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more output records from each input record.
- FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function
-
A function that takes two inputs and returns zero or more output records.
- flatMapGroups(Function2<K, Iterator<V>, TraversableOnce<U>>, Encoder<U>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
(Scala-specific)
Applies the given function to each group of data.
- flatMapGroups(FlatMapGroupsFunction<K, V, U>, Encoder<U>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
(Java-specific)
Applies the given function to each group of data.
- FlatMapGroupsFunction<K,V,R> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more output records from each grouping key and its values.
- flatMapGroupsWithState(OutputMode, GroupStateTimeout, Function3<K, Iterator<V>, GroupState<S>, Iterator<U>>, Encoder<S>, Encoder<U>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Experimental
(Scala-specific)
Applies the given function to each group of data, while maintaining a user-defined per-group
state.
- flatMapGroupsWithState(FlatMapGroupsWithStateFunction<K, V, S, U>, OutputMode, Encoder<S>, Encoder<U>, GroupStateTimeout) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Experimental
(Java-specific)
Applies the given function to each group of data, while maintaining a user-defined per-group
state.
- FlatMapGroupsWithStateFunction<K,V,S,R> - Interface in org.apache.spark.api.java.function
-
Experimental
Base interface for a map function used in
org.apache.spark.sql.KeyValueGroupedDataset.flatMapGroupsWithState(FlatMapGroupsWithStateFunction,
org.apache.spark.sql.streaming.OutputMode, org.apache.spark.sql.Encoder, org.apache.spark.sql.Encoder)
- flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Pass each value in the key-value pair RDD through a flatMap function without changing the
keys; this also retains the original RDD's partitioning.
- flatMapValues(Function1<V, TraversableOnce<U>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Pass each value in the key-value pair RDD through a flatMap function without changing the
keys; this also retains the original RDD's partitioning.
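As an illustration of the flatMapValues contract (each value may expand to zero or more outputs, and every output keeps its original key), here is a plain-Java sketch over an in-memory list of pairs. The class, the `words` helper, and the use of `Map.Entry` for pairs are assumptions made for this example; none of it is Spark code.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory model of flatMapValues; not part of the Spark API.
final class FlatMapValuesSketch {
    // Expands each value with f and pairs every expanded element with the unchanged key.
    static <K, V, U> List<Map.Entry<K, U>> flatMapValues(
            List<Map.Entry<K, V>> pairs, Function<V, List<U>> f) {
        List<Map.Entry<K, U>> out = new ArrayList<>();
        for (Map.Entry<K, V> pair : pairs) {
            for (U expanded : f.apply(pair.getValue())) {
                out.add(new SimpleEntry<>(pair.getKey(), expanded));
            }
        }
        return out;
    }

    // Example expansion: split a string into words; an empty string yields no output.
    static List<String> words(String s) {
        return s.isEmpty() ? Collections.<String>emptyList() : Arrays.asList(s.split(" "));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>("a", "x y"));
        pairs.add(new SimpleEntry<>("b", ""));
        pairs.add(new SimpleEntry<>("c", "z"));
        System.out.println(flatMapValues(pairs, FlatMapValuesSketch::words));
    }
}
```

Key "b" disappears from the result because its value expands to nothing, while key "a" appears twice. Since keys are never rewritten, a key-based partitioning of the input remains valid for the output.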
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying a flatMap function to the value of each key-value pair in
'this' DStream without changing the key.
- flatMapValues(Function1<V, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying a flatMap function to the value of each key-value pair in
'this' DStream without changing the key.
- flatten(Column) - Static method in class org.apache.spark.sql.functions
-
Creates a single array from an array of arrays.
- FLOAT() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable float type.
- FloatAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
Deprecated.
- FloatParam - Class in org.apache.spark.ml.param
-
Developer API
Specialized version of Param[Float] for Java.
- FloatParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- FloatParam(String, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- FloatParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- FloatParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- FloatType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the FloatType object.
- FloatType - Class in org.apache.spark.sql.types
-
The data type representing Float values.
- FloatType() - Constructor for class org.apache.spark.sql.types.FloatType
-
- floor(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the floor of the given value.
- floor(String) - Static method in class org.apache.spark.sql.functions
-
Computes the floor of the given column.
- floor() - Method in class org.apache.spark.sql.types.Decimal
-
- floor(Duration) - Method in class org.apache.spark.streaming.Time
-
- floor(Duration, Time) - Method in class org.apache.spark.streaming.Time
-
- flush() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
-
- flush() - Method in class org.apache.spark.serializer.SerializationStream
-
- flush() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
-
- fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f-measure for a given label (category)
- fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f1-measure for a given label (category)
- fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
- fMeasureByLabel(double) - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Returns f-measure for each label (category).
- fMeasureByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Returns f1-measure for each label (category).
- fMeasureByThreshold() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-
Returns a DataFrame with two fields (threshold, F-Measure) describing the F-Measure curve with beta = 1.0.
- fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, F-Measure) curve.
- fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, F-Measure) curve with beta = 1.0.
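For reference, the standard F-measure combines precision P and recall R as (1 + beta^2) * P * R / (beta^2 * P + R), and beta = 1.0 gives the familiar F1 score. The sketch below is plain Java with hypothetical names. Returning 0.0 when the denominator is zero is a choice made for this example, not a statement about how the classes listed here handle that case.

```java
// Hypothetical helper illustrating the F-beta formula; not part of the Spark API.
final class FMeasureSketch {
    // F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    static double fMeasure(double precision, double recall, double beta) {
        double betaSq = beta * beta;
        double denominator = betaSq * precision + recall;
        // Assumption of this sketch: define the score as 0.0 when it would be 0/0.
        return denominator == 0.0 ? 0.0 : (1.0 + betaSq) * precision * recall / denominator;
    }

    public static void main(String[] args) {
        System.out.println(fMeasure(0.5, 0.5, 1.0)); // F1 of equal precision and recall
        System.out.println(fMeasure(0.5, 1.0, 2.0)); // beta > 1 weights recall more heavily
    }
}
```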
- fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using a
given associative function and a neutral "zero value".
- fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using a
given associative function and a neutral "zero value".
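The requirement that the zero value be neutral can be seen in a simplified in-memory model of a partitioned fold. In this plain-Java sketch (hypothetical names, lists standing in for partitions), the zero value seeds every partition and also seeds the final merge, so a non-neutral zero makes the result depend on how the data happens to be partitioned.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical model of a partitioned fold; not Spark code.
final class FoldSketch {
    // Folds each partition starting from zero, then merges the partial
    // results into an accumulator that also starts from zero.
    static <T> T fold(List<List<T>> partitions, T zero, BinaryOperator<T> op) {
        T result = zero;
        for (List<T> partition : partitions) {
            T partial = zero;
            for (T element : partition) {
                partial = op.apply(partial, element);
            }
            result = op.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> twoParts = Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3, 4));
        List<List<Integer>> onePart = Arrays.asList(Arrays.asList(1, 2, 3, 4));
        System.out.println(fold(twoParts, 0, Integer::sum)); // neutral zero
        System.out.println(fold(onePart, 0, Integer::sum));  // same result
        System.out.println(fold(twoParts, 1, Integer::sum)); // non-neutral zero
        System.out.println(fold(onePart, 1, Integer::sum));  // differs from the line above
    }
}
```

With zero = 0 both layouts sum to 10. With zero = 1 the two-partition layout yields 13 and the single-partition layout yields 12, which is why the zero value must not change the result.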
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value"
which may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- forceIndexLabel() - Method in interface org.apache.spark.ml.feature.RFormulaBase
-
Forces the label to be indexed, whether it is numeric or string type.
- foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Applies a function f to all elements of this RDD.
- foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies a function f to all elements of this RDD.
- foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.sql.Dataset
-
Applies a function f to all rows.
- foreach(ForeachFunction<T>) - Method in class org.apache.spark.sql.Dataset
-
(Java-specific)
Runs func on each element of this Dataset.
- foreach(ForeachWriter<T>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Sets the output of the streaming query to be processed using the provided writer object.
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in class org.apache.spark.ml.linalg.DenseMatrix
-
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.ml.linalg.DenseVector
-
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Applies a function f to all the active elements of dense and sparse matrix.
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in class org.apache.spark.ml.linalg.SparseMatrix
-
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.ml.linalg.SparseVector
-
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.ml.linalg.Vector
-
Applies a function f to all the active elements of dense and sparse vector.
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Applies a function f to all the active elements of dense and sparse matrix.
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Applies a function f to all the active elements of dense and sparse vector.
- foreachAsync(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the foreach action, which
applies a function f to all the elements of this RDD.
- foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Applies a function f to all elements of this RDD.
- foreachBatch(Function2<Dataset<T>, Object, BoxedUnit>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Experimental
- foreachBatch(VoidFunction2<Dataset<T>, Long>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Experimental
- ForeachFunction<T> - Interface in org.apache.spark.api.java.function
-
Base interface for a function used in Dataset's foreach function.
- foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Applies a function f to each partition of this RDD.
- foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies a function f to each partition of this RDD.
- foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.sql.Dataset
-
Applies a function f to each partition of this Dataset.
- foreachPartition(ForeachPartitionFunction<T>) - Method in class org.apache.spark.sql.Dataset
-
(Java-specific)
Runs func on each partition of this Dataset.
- foreachPartitionAsync(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the foreachPartition action, which
applies a function f to each partition of this RDD.
- foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Applies a function f to each partition of this RDD.
- ForeachPartitionFunction<T> - Interface in org.apache.spark.api.java.function
-
Base interface for a function used in Dataset's foreachPartition function.
- foreachRDD(VoidFunction<R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Apply a function to each RDD in this DStream.
- foreachRDD(VoidFunction2<R, Time>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- ForeachWriter<T> - Class in org.apache.spark.sql
-
The abstract class for writing custom logic to process data generated by a query.
- ForeachWriter() - Constructor for class org.apache.spark.sql.ForeachWriter
-
- format() - Method in class org.apache.spark.ml.clustering.InternalKMeansModelWriter
-
- format() - Method in class org.apache.spark.ml.clustering.PMMLKMeansModelWriter
-
- format() - Method in class org.apache.spark.ml.regression.InternalLinearRegressionModelWriter
-
- format() - Method in class org.apache.spark.ml.regression.PMMLLinearRegressionModelWriter
-
- format(String) - Method in class org.apache.spark.ml.util.GeneralMLWriter
-
Specifies the format of ML export (e.g.
- format() - Method in interface org.apache.spark.ml.util.MLFormatRegister
-
The string that represents the format that this format provider uses.
- format(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Specifies the input data source format.
- format(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Specifies the underlying output data source.
- format(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Specifies the input data source format.
- format(String) - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Specifies the underlying output data source.
- format_number(Column, int) - Static method in class org.apache.spark.sql.functions
-
Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places
with HALF_EVEN round mode, and returns the result as a string column.
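The grouping and HALF_EVEN rounding described for format_number can be reproduced with the JDK's `java.text.DecimalFormat`. The sketch below is an illustration under assumptions: the class name is hypothetical, the US locale is chosen for the separators, and the pattern always prints exactly d decimal places.

```java
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical JDK-only illustration; not the Spark implementation.
final class FormatNumberSketch {
    // Formats x with thousands separators and d decimal places, rounding HALF_EVEN.
    static String formatNumber(double x, int d) {
        StringBuilder pattern = new StringBuilder("#,##0");
        if (d > 0) {
            pattern.append('.');
            for (int i = 0; i < d; i++) {
                pattern.append('0');
            }
        }
        DecimalFormat format =
                new DecimalFormat(pattern.toString(), DecimalFormatSymbols.getInstance(Locale.US));
        format.setRoundingMode(RoundingMode.HALF_EVEN);
        return format.format(x);
    }

    public static void main(String[] args) {
        System.out.println(formatNumber(1234567.125, 2));
        System.out.println(formatNumber(1234567.375, 2));
    }
}
```

HALF_EVEN sends exact ties to the neighbour whose last digit is even, so 1234567.125 becomes 1,234,567.12 while 1234567.375 becomes 1,234,567.38.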
- format_string(String, Column...) - Static method in class org.apache.spark.sql.functions
-
Formats the arguments in printf-style and returns the result as a string column.
- format_string(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Formats the arguments in printf-style and returns the result as a string column.
- formatBatchTime(long, long, boolean, TimeZone) - Static method in class org.apache.spark.streaming.ui.UIUtils
-
If batchInterval is less than 1 second, format batchTime with milliseconds.
- formatDate(Date) - Static method in class org.apache.spark.ui.UIUtils
-
- formatDate(long) - Static method in class org.apache.spark.ui.UIUtils
-
- formatDuration(long) - Static method in class org.apache.spark.ui.UIUtils
-
- formatDurationVerbose(long) - Static method in class org.apache.spark.ui.UIUtils
-
Generate a verbose human-readable string representing a duration such as "5 second 35 ms"
- formatNumber(double) - Static method in class org.apache.spark.ui.UIUtils
-
Generate a human-readable string representing a number (e.g.
- formatVersion() - Method in interface org.apache.spark.mllib.util.Saveable
-
Current version of model save/load format.
- formula() - Method in interface org.apache.spark.ml.feature.RFormulaBase
-
R formula parameter.
- forward(DenseMatrix<Object>, boolean) - Method in interface org.apache.spark.ml.ann.TopologyModel
-
Forward propagation
- FPGrowth - Class in org.apache.spark.ml.fpm
-
Experimental
A parallel FP-growth algorithm to mine frequent itemsets.
- FPGrowth(String) - Constructor for class org.apache.spark.ml.fpm.FPGrowth
-
- FPGrowth() - Constructor for class org.apache.spark.ml.fpm.FPGrowth
-
- FPGrowth - Class in org.apache.spark.mllib.fpm
-
A parallel FP-growth algorithm to mine frequent itemsets.
- FPGrowth() - Constructor for class org.apache.spark.mllib.fpm.FPGrowth
-
Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same
as the input data}.
- FPGrowth.FreqItemset<Item> - Class in org.apache.spark.mllib.fpm
-
Frequent itemset.
- FPGrowthModel - Class in org.apache.spark.ml.fpm
-
Experimental
Model fitted by FPGrowth.
- FPGrowthModel<Item> - Class in org.apache.spark.mllib.fpm
-
Model trained by FPGrowth, which holds frequent itemsets.
- FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, Map<Item, Object>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
-
- FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
-
- FPGrowthModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.fpm
-
- FPGrowthParams - Interface in org.apache.spark.ml.fpm
-
Common params for FPGrowth and FPGrowthModel
- fpr() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
The highest p-value for features to be kept.
- fpr() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- freq() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- freq() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
-
- freqItems(String[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Finding frequent items for columns, possibly with false positives.
- freqItems(String[]) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Finding frequent items for columns, possibly with false positives.
- freqItems(Seq<String>, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
(Scala-specific) Finding frequent items for columns, possibly with false positives.
- freqItems(Seq<String>) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
(Scala-specific) Finding frequent items for columns, possibly with false positives.
- FreqItemset(Object, long) - Constructor for class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- freqItemsets() - Method in class org.apache.spark.ml.fpm.FPGrowthModel
-
- freqItemsets() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
-
- FreqSequence(Object[], long) - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
-
- freqSequences() - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel
-
- from_json(Column, StructType, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Scala-specific) Parses a column containing a JSON string into a StructType with the
specified schema.
- from_json(Column, DataType, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type,
StructType or ArrayType with the specified schema.
- from_json(Column, StructType, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Java-specific) Parses a column containing a JSON string into a StructType with the
specified schema.
- from_json(Column, DataType, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type,
StructType or ArrayType with the specified schema.
- from_json(Column, StructType) - Static method in class org.apache.spark.sql.functions
-
Parses a column containing a JSON string into a StructType with the specified schema.
- from_json(Column, DataType) - Static method in class org.apache.spark.sql.functions
-
Parses a column containing a JSON string into a MapType with StringType as keys type,
StructType or ArrayType with the specified schema.
- from_json(Column, String, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type,
StructType or ArrayType with the specified schema.
- from_json(Column, String, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type,
StructType or ArrayType with the specified schema.
- from_json(Column, Column) - Static method in class org.apache.spark.sql.functions
-
(Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type,
StructType or ArrayType of StructTypes with the specified schema.
- from_json(Column, Column, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type,
StructType or ArrayType of StructTypes with the specified schema.
- from_unixtime(Column) - Static method in class org.apache.spark.sql.functions
-
Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string
representing the timestamp of that moment in the current system time zone in the
yyyy-MM-dd HH:mm:ss format.
- from_unixtime(Column, String) - Static method in class org.apache.spark.sql.functions
-
Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string
representing the timestamp of that moment in the current system time zone in the given
format.
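The conversion described by the two from_unixtime entries can be approximated outside Spark with java.time. The sketch below is only an illustration: the class UnixTimeSketch and its method fromUnixTime are hypothetical names, the time zone is passed explicitly instead of being taken from the system, and java.time pattern letters are assumed to be close enough to the patterns Spark accepts for simple formats such as yyyy-MM-dd HH:mm:ss.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-alone helper; it does not call Spark.
class UnixTimeSketch {
    // Formats seconds since the Unix epoch as text in the given zone and pattern.
    static String fromUnixTime(long epochSeconds, String pattern, ZoneId zone) {
        return DateTimeFormatter.ofPattern(pattern)
                .withZone(zone)
                .format(Instant.ofEpochSecond(epochSeconds));
    }

    public static void main(String[] args) {
        // Explicit UTC zone: epoch second 0 is rendered as 1970-01-01 00:00:00.
        System.out.println(fromUnixTime(0L, "yyyy-MM-dd HH:mm:ss", ZoneId.of("UTC")));
        // System default zone, closer to what the indexed functions describe.
        System.out.println(fromUnixTime(0L, "yyyy/MM/dd HH:mm", ZoneId.systemDefault()));
    }
}
```

Passing the zone as a parameter keeps the sketch deterministic; the indexed functions instead use the current system time zone, so the same epoch value can render differently on different machines.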
- from_utc_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions
-
Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders
that time as a timestamp in the given time zone.
- from_utc_timestamp(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders
that time as a timestamp in the given time zone.
- fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.ml.linalg.SparseMatrix
-
Generate a SparseMatrix from Coordinate List (COO) format.
- fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix from Coordinate List (COO) format.
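Coordinate List (COO) format describes a sparse matrix as (row, column, value) triples. As a rough illustration of what building a column-oriented sparse matrix from such triples involves, the sketch below converts COO entries into compressed sparse column (CSC) arrays. It is not Spark's implementation: the class CscSketch, its field names, and the choice to leave duplicate coordinates unmerged are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration of a COO -> CSC conversion; not Spark's code.
class CscSketch {
    final int[] colPtrs;     // colPtrs[j]..colPtrs[j + 1] delimit the entries of column j
    final int[] rowIndices;  // row index of each stored value
    final double[] values;   // stored values in column-major order

    private CscSketch(int[] colPtrs, int[] rowIndices, double[] values) {
        this.colPtrs = colPtrs;
        this.rowIndices = rowIndices;
        this.values = values;
    }

    // Each entry is {row, col, value}. Duplicate coordinates are kept as-is in this sketch.
    static CscSketch fromCoo(int numRows, int numCols, double[][] entries) {
        double[][] sorted = entries.clone();
        // Column-major order: sort by column first, then by row.
        Arrays.sort(sorted, Comparator
                .comparingDouble((double[] e) -> e[1])
                .thenComparingDouble(e -> e[0]));
        int[] colPtrs = new int[numCols + 1];
        int[] rowIndices = new int[sorted.length];
        double[] values = new double[sorted.length];
        for (int k = 0; k < sorted.length; k++) {
            int row = (int) sorted[k][0];
            int col = (int) sorted[k][1];
            if (row < 0 || row >= numRows || col < 0 || col >= numCols) {
                throw new IllegalArgumentException("entry out of bounds: (" + row + ", " + col + ")");
            }
            rowIndices[k] = row;
            values[k] = sorted[k][2];
            colPtrs[col + 1]++;            // count entries per column
        }
        for (int j = 0; j < numCols; j++) {
            colPtrs[j + 1] += colPtrs[j];  // prefix sum turns counts into start offsets
        }
        return new CscSketch(colPtrs, rowIndices, values);
    }

    public static void main(String[] args) {
        CscSketch m = fromCoo(3, 3, new double[][] {{0, 0, 1.0}, {2, 1, 3.0}, {1, 0, 2.0}});
        System.out.println(Arrays.toString(m.colPtrs));
        System.out.println(Arrays.toString(m.rowIndices));
        System.out.println(Arrays.toString(m.values));
    }
}
```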
- fromDDL(String) - Static method in class org.apache.spark.sql.types.DataType
-
- fromDDL(String) - Static method in class org.apache.spark.sql.types.StructType
-
Creates StructType for a given DDL-formatted string, which is a comma separated list of field
definitions, e.g., a INT, b STRING.
- fromDecimal(Object) - Static method in class org.apache.spark.sql.types.Decimal
-
- fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
-
- fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from EdgePartitions, setting referenced vertices to defaultVertexAttr.
- fromEdges(RDD<Edge<ED>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
-
Creates an EdgeRDD from a set of edges.
- fromEdges(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of edges.
- fromEdges(EdgeRDD<?>, int, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD containing all vertices referred to in edges.
- fromEdgeTuples(RDD<Tuple2<Object, Object>>, VD, Option<PartitionStrategy>, StorageLevel, StorageLevel, ClassTag<VD>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of edges encoded as vertex id pairs.
- fromExistingRDDs(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the
vertices.
- fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
-
- fromInt(int) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
Convert a JavaRDD of key-value pairs to JavaPairRDD.
- fromJson(String) - Static method in class org.apache.spark.ml.linalg.JsonMatrixConverter
-
Parses the JSON representation of a Matrix into a Matrix.
- fromJson(String) - Static method in class org.apache.spark.ml.linalg.JsonVectorConverter
-
Parses the JSON representation of a vector into a Vector.
- fromJson(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Parses the JSON representation of a vector into a Vector.
- fromJson(String) - Static method in class org.apache.spark.sql.types.DataType
-
- fromJson(String) - Static method in class org.apache.spark.sql.types.Metadata
-
Creates a Metadata instance from JSON.
- fromKinesisInitialPosition(InitialPositionInStream) - Static method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions
-
Returns an instance of KinesisInitialPosition based on the passed InitialPositionInStream.
- fromMetadata(Metadata) - Method in interface org.apache.spark.ml.attribute.AttributeFactory
-
Creates an Attribute from a Metadata instance.
- fromML(DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Convert new linalg type to spark.mllib type.
- fromML(DenseVector) - Static method in class org.apache.spark.mllib.linalg.DenseVector
-
Convert new linalg type to spark.mllib type.
- fromML(Matrix) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Convert new linalg type to spark.mllib type.
- fromML(SparseMatrix) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Convert new linalg type to spark.mllib type.
- fromML(SparseVector) - Static method in class org.apache.spark.mllib.linalg.SparseVector
-
Convert new linalg type to spark.mllib type.
- fromML(Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Convert new linalg type to spark.mllib type.
- fromName(String) - Static method in class org.apache.spark.ml.attribute.AttributeType
-
- fromNullable(T) - Static method in class org.apache.spark.api.java.Optional
-
- fromOld(Node, Map<Object, Object>) - Static method in class org.apache.spark.ml.tree.Node
-
Create a new Node from the old Node format, recursively creating child nodes as needed.
- fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- fromPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions
-
Implicit conversion from a pair RDD to MLPairRDDFunctions.
- fromParams(GeneralizedLinearRegressionBase) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Family$
-
Gets the Family object based on param family and variancePower.
- fromParams(GeneralizedLinearRegressionBase) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Link$
-
Gets the Link object based on param family, link and linkPower.
- fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-
- fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RDDFunctions
-
Implicit conversion from an RDD to RDDFunctions.
- fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
-
- fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-
- fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-
- fromStage(Stage, int, Option<Object>, TaskMetrics, Seq<Seq<TaskLocation>>) - Static method in class org.apache.spark.scheduler.StageInfo
-
Construct a StageInfo from a Stage.
- fromString(String) - Static method in enum org.apache.spark.JobExecutionStatus
-
- fromString(String) - Static method in class org.apache.spark.mllib.tree.impurity.Impurities
-
- fromString(String) - Static method in class org.apache.spark.mllib.tree.loss.Losses
-
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus
-
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.StageStatus
-
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.streaming.BatchStatus
-
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.TaskSorting
-
- fromString(String) - Static method in class org.apache.spark.storage.StorageLevel
-
Developer API
Return the StorageLevel object with the specified name.
- fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.Attribute
-
- fromStructField(StructField) - Method in interface org.apache.spark.ml.attribute.AttributeFactory
-
Creates an Attribute from a StructField instance.
- fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.AttributeGroup
-
Creates an attribute group from a StructField instance.
- fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.BinaryAttribute
-
- fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.NominalAttribute
-
- fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.NumericAttribute
-
- fullOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
- fullOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullStackTrace() - Method in class org.apache.spark.ExceptionFailure
-
- Function<T1,R> - Interface in org.apache.spark.api.java.function
-
Base interface for functions whose return types do not create special RDDs.
- Function - Class in org.apache.spark.sql.catalog
-
A user-defined function in Spark, as returned by the listFunctions method in Catalog.
- Function(String, String, String, String, boolean) - Constructor for class org.apache.spark.sql.catalog.Function
-
- function(Function4<Time, KeyType, Option<ValueType>, State<StateType>, Option<MappedType>>) - Static method in class org.apache.spark.streaming.StateSpec
-
- function(Function3<KeyType, Option<ValueType>, State<StateType>, MappedType>) - Static method in class org.apache.spark.streaming.StateSpec
-
- function(Function4<Time, KeyType, Optional<ValueType>, State<StateType>, Optional<MappedType>>) - Static method in class org.apache.spark.streaming.StateSpec
-
- function(Function3<KeyType, Optional<ValueType>, State<StateType>, MappedType>) - Static method in class org.apache.spark.streaming.StateSpec
-
- Function0<R> - Interface in org.apache.spark.api.java.function
-
A zero-argument function that returns an R.
- Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function
-
A two-argument function that takes arguments of type T1 and T2 and returns an R.
- Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function
-
A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
- Function4<T1,T2,T3,T4,R> - Interface in org.apache.spark.api.java.function
-
A four-argument function that takes arguments of type T1, T2, T3 and T4 and returns an R.
- functionExists(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Check if the function with the specified name exists.
- functionExists(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Check if the function with the specified name exists in the specified database.
- functionExists(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Return whether a function exists in the specified database.
- functions - Class in org.apache.spark.sql
-
Commonly used functions available for DataFrame operations.
- functions() - Constructor for class org.apache.spark.sql.functions
-
- FutureAction<T> - Interface in org.apache.spark
-
A future for the result of an action to support cancellation.
- futureExecutionContext() - Static method in class org.apache.spark.rdd.AsyncRDDActions
-
- fwe() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
The upper bound of the expected family-wise error rate.
- fwe() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- gain() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
-
- gain() - Method in class org.apache.spark.ml.tree.InternalNode
-
- gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- Gamma$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
-
- gamma1() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma2() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma6() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma7() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- GammaGenerator - Class in org.apache.spark.mllib.random
-
Developer API
Generates i.i.d.
- GammaGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.GammaGenerator
-
- gammaJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Java-friendly version of RandomRDDs.gammaRDD.
- gammaJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.gammaJavaRDD with the default seed.
- gammaJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.gammaJavaRDD with the default number of partitions and the default seed.
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Java-friendly version of RandomRDDs.gammaVectorRDD.
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.gammaJavaVectorRDD with the default seed.
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.gammaJavaVectorRDD with the default number of partitions and the default seed.
- gammaRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input
shape and scale.
- gammaVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the
gamma distribution with the input shape and scale.
- gapply(RelationalGroupedDataset, byte[], byte[], Object[], StructType) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
The helper function for gapply() on R side.
- gaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
Indicates whether regex splits on gaps (true) or matches tokens (false).
- GAUGE() - Static method in class org.apache.spark.metrics.sink.StatsdMetricType
-
- Gaussian$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
-
- GaussianMixture - Class in org.apache.spark.ml.clustering
-
Gaussian Mixture clustering.
- GaussianMixture(String) - Constructor for class org.apache.spark.ml.clustering.GaussianMixture
-
- GaussianMixture() - Constructor for class org.apache.spark.ml.clustering.GaussianMixture
-
- GaussianMixture - Class in org.apache.spark.mllib.clustering
-
This class performs expectation maximization for multivariate Gaussian
Mixture Models (GMMs).
- GaussianMixture() - Constructor for class org.apache.spark.mllib.clustering.GaussianMixture
-
Constructs a default instance.
- GaussianMixtureModel - Class in org.apache.spark.ml.clustering
-
Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points
are drawn from each Gaussian i with probability weights(i).
- GaussianMixtureModel - Class in org.apache.spark.mllib.clustering
-
Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points
are drawn from each Gaussian i=1..k with probability w(i); mu(i) and sigma(i) are
the respective mean and covariance for each Gaussian distribution i=1..k.
- GaussianMixtureModel(double[], MultivariateGaussian[]) - Constructor for class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- GaussianMixtureParams - Interface in org.apache.spark.ml.clustering
-
Common params for GaussianMixture and GaussianMixtureModel
- GaussianMixtureSummary - Class in org.apache.spark.ml.clustering
-
Experimental
Summary of GaussianMixture.
- gaussians() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- gaussians() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- gaussiansDF() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
Retrieve Gaussian distributions as a DataFrame.
- GBTClassificationModel - Class in org.apache.spark.ml.classification
-
Gradient-Boosted Trees (GBTs) (http://en.wikipedia.org/wiki/Gradient_boosting)
model for classification.
- GBTClassificationModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.classification.GBTClassificationModel
-
Construct a GBTClassificationModel
- GBTClassifier - Class in org.apache.spark.ml.classification
-
Gradient-Boosted Trees (GBTs) (http://en.wikipedia.org/wiki/Gradient_boosting)
learning algorithm for classification.
- GBTClassifier(String) - Constructor for class org.apache.spark.ml.classification.GBTClassifier
-
- GBTClassifier() - Constructor for class org.apache.spark.ml.classification.GBTClassifier
-
- GBTClassifierParams - Interface in org.apache.spark.ml.tree
-
- GBTParams - Interface in org.apache.spark.ml.tree
-
Parameters for Gradient-Boosted Tree algorithms.
- GBTRegressionModel - Class in org.apache.spark.ml.regression
-
- GBTRegressionModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.regression.GBTRegressionModel
-
Construct a GBTRegressionModel
- GBTRegressor - Class in org.apache.spark.ml.regression
-
- GBTRegressor(String) - Constructor for class org.apache.spark.ml.regression.GBTRegressor
-
- GBTRegressor() - Constructor for class org.apache.spark.ml.regression.GBTRegressor
-
- GBTRegressorParams - Interface in org.apache.spark.ml.tree
-
- GC_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- GC_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- gemm(double, Matrix, DenseMatrix, double, DenseMatrix) - Static method in class org.apache.spark.ml.linalg.BLAS
-
C := alpha * A * B + beta * C
- gemm(double, Matrix, DenseMatrix, double, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
C := alpha * A * B + beta * C
- gemv(double, Matrix, Vector, double, DenseVector) - Static method in class org.apache.spark.ml.linalg.BLAS
-
y := alpha * A * x + beta * y
- gemv(double, Matrix, Vector, double, DenseVector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y := alpha * A * x + beta * y
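The update rules quoted in the gemm and gemv entries can be spelled out on plain Java arrays. The sketch below is only an illustration of the formulas: the class BlasSketch is a hypothetical name, matrices are assumed to be dense row-major double[][] arrays with compatible dimensions, and none of Spark's Matrix, DenseMatrix or Vector types are used.

```java
import java.util.Arrays;

// Hypothetical illustration of the gemv and gemm update rules; not Spark's BLAS class.
class BlasSketch {
    // y := alpha * A * x + beta * y, updating y in place.
    static void gemv(double alpha, double[][] a, double[] x, double beta, double[] y) {
        for (int i = 0; i < a.length; i++) {
            double dot = 0.0;
            for (int j = 0; j < x.length; j++) {
                dot += a[i][j] * x[j];
            }
            y[i] = alpha * dot + beta * y[i];
        }
    }

    // C := alpha * A * B + beta * C, updating C in place.
    static void gemm(double alpha, double[][] a, double[][] b, double beta, double[][] c) {
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b[0].length; j++) {
                double dot = 0.0;
                for (int k = 0; k < b.length; k++) {
                    dot += a[i][k] * b[k][j];
                }
                c[i][j] = alpha * dot + beta * c[i][j];
            }
        }
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[] y = {10, 20};
        gemv(2.0, a, new double[] {1, 1}, 0.5, y);
        System.out.println(Arrays.toString(y));
    }
}
```

Both routines overwrite their last argument, which matches the ":=" in the formulas: the previous contents of y or C are scaled by beta and folded into the result.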
- GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression
-
Developer API
GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
- GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
- GeneralizedLinearModel - Class in org.apache.spark.mllib.regression
-
Developer API
GeneralizedLinearModel (GLM) represents a model trained using
GeneralizedLinearAlgorithm.
- GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
- GeneralizedLinearRegression - Class in org.apache.spark.ml.regression
-
Experimental
- GeneralizedLinearRegression(String) - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
- GeneralizedLinearRegression() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
- GeneralizedLinearRegression.Binomial$ - Class in org.apache.spark.ml.regression
-
Binomial exponential family distribution.
- GeneralizedLinearRegression.CLogLog$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Family$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.FamilyAndLink$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Gamma$ - Class in org.apache.spark.ml.regression
-
Gamma exponential family distribution.
- GeneralizedLinearRegression.Gaussian$ - Class in org.apache.spark.ml.regression
-
Gaussian exponential family distribution.
- GeneralizedLinearRegression.Identity$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Inverse$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Link$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Log$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Logit$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Poisson$ - Class in org.apache.spark.ml.regression
-
Poisson exponential family distribution.
- GeneralizedLinearRegression.Probit$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Sqrt$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegression.Tweedie$ - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegressionBase - Interface in org.apache.spark.ml.regression
-
Params for Generalized Linear Regression.
- GeneralizedLinearRegressionModel - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegressionSummary - Class in org.apache.spark.ml.regression
-
- GeneralizedLinearRegressionTrainingSummary - Class in org.apache.spark.ml.regression
-
- GeneralMLWritable - Interface in org.apache.spark.ml.util
-
Trait for classes that provide GeneralMLWriter.
- GeneralMLWriter - Class in org.apache.spark.ml.util
-
A ML Writer which delegates based on the requested format.
- GeneralMLWriter(PipelineStage) - Constructor for class org.apache.spark.ml.util.GeneralMLWriter
-
- generateAssociationRules(double) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
-
Generates association rules for the Items in freqItemsets.
- generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-
Generate an RDD containing test data for KMeans.
- generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
For compatibility, the generated data without specifying the mean and variance
will have zero mean and variance of (1.0/3.0) since the original output range is
[-1, 1] with uniform distribution, and the variance of uniform distribution
is (b - a)^2 / 12 which will be (1.0/3.0)
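The arithmetic behind the (1.0/3.0) figure is short enough to check directly. In the sketch below, the class UniformVarianceSketch and its method are hypothetical names used only for this illustration; the formula is the standard variance of a continuous uniform distribution on [a, b].

```java
// Hypothetical illustration of the variance arithmetic; not part of LinearDataGenerator.
class UniformVarianceSketch {
    // Variance of a continuous uniform distribution on [a, b]: (b - a)^2 / 12.
    static double uniformVariance(double a, double b) {
        double width = b - a;
        return width * width / 12.0;
    }

    public static void main(String[] args) {
        // For the range [-1, 1] the width is 2, so the variance is 4 / 12 = 1 / 3.
        System.out.println(uniformVariance(-1.0, 1.0));
    }
}
```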
- generateLinearInput(double, double[], double[], double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- generateLinearInput(double, double[], double[], double[], int, int, double, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
Return a Java List of synthetic data randomly generated according to a multi
collinear model.
- generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso,
and unregularized variants.
- generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
Generate an RDD containing test data for LogisticRegression.
- generateRandomEdges(int, int, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- generateRolledOverFileSuffix() - Method in interface org.apache.spark.util.logging.RollingPolicy
-
Get the desired name of the rollover file
- geq(Object) - Method in class org.apache.spark.sql.Column
-
Greater than or equal to an expression.
- get(Object) - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
-
- get() - Method in class org.apache.spark.api.java.Optional
-
- get() - Static method in class org.apache.spark.BarrierTaskContext
-
Experimental
Returns the currently active BarrierTaskContext.
- get() - Method in interface org.apache.spark.FutureAction
-
Blocks and returns the result of this job.
- get(String) - Method in interface org.apache.spark.internal.config.ConfigProvider
-
- get(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Optionally returns the value associated with a param.
- get(Param<T>) - Method in interface org.apache.spark.ml.param.Params
-
Optionally returns the user-supplied value of a param.
- get(String) - Method in class org.apache.spark.SparkConf
-
Get a parameter; throws a NoSuchElementException if it's not set
- get(String, String) - Method in class org.apache.spark.SparkConf
-
Get a parameter, falling back to a default if not set
- get() - Static method in class org.apache.spark.SparkEnv
-
Returns the SparkEnv.
- get(String) - Static method in class org.apache.spark.SparkFiles
-
Get the absolute path of a file added through SparkContext.addFile().
- get(String) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects
-
Fetch the JdbcDialect class corresponding to a given database url.
- get(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i.
- get(String) - Method in class org.apache.spark.sql.RuntimeConfig
-
Returns the value of Spark runtime configuration property for the given key.
- get(String, String) - Method in class org.apache.spark.sql.RuntimeConfig
-
Returns the value of Spark runtime configuration property for the given key.
- get(String) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
Returns the option value to which the specified key is mapped, case-insensitively.
- get() - Method in interface org.apache.spark.sql.sources.v2.reader.InputPartitionReader
-
Return the current record.
- get() - Method in interface org.apache.spark.sql.streaming.GroupState
-
Get the state value if it exists, or throw NoSuchElementException.
- get(UUID) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
-
Returns the query if there is an active query with the given id, or null.
- get(String) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
-
Returns the query if there is an active query with the given id, or null.
- get(int, DataType) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- get(int, DataType) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- get() - Method in class org.apache.spark.streaming.State
-
Get the state if it exists, otherwise it will throw java.util.NoSuchElementException.
- get() - Static method in class org.apache.spark.TaskContext
-
Return the currently active TaskContext.
- get(long) - Static method in class org.apache.spark.util.AccumulatorContext
-
- get_json_object(Column, String) - Static method in class org.apache.spark.sql.functions
-
Extracts json object from a json string based on json path specified, and returns json string
of the extracted json object.
- getAcceptanceResults(RDD<Tuple2<K, V>>, boolean, Map<K, Object>, Option<Map<K, Object>>, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Count the number of items instantly accepted and generate the waitlist for each stratum.
- getActive() - Static method in class org.apache.spark.streaming.StreamingContext
-
Experimental
- getActiveJobIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns an array containing the ids of all active jobs.
- getActiveJobIds() - Method in class org.apache.spark.SparkStatusTracker
-
Returns an array containing the ids of all active jobs.
- getActiveOrCreate(Function0<StreamingContext>) - Static method in class org.apache.spark.streaming.StreamingContext
-
Experimental
- getActiveOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
-
Experimental
- getActiveSession() - Static method in class org.apache.spark.sql.SparkSession
-
Returns the active SparkSession for the current thread, returned by the builder.
- getActiveStageIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns an array containing the ids of all active stages.
- getActiveStageIds() - Method in class org.apache.spark.SparkStatusTracker
-
Returns an array containing the ids of all active stages.
- getAggregationDepth() - Method in interface org.apache.spark.ml.param.shared.HasAggregationDepth
-
- getAlgo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getAll() - Method in class org.apache.spark.SparkConf
-
Get all parameters as a list of pairs
- getAll() - Method in class org.apache.spark.sql.RuntimeConfig
-
Returns all properties set in this conf.
- getAllConfs() - Method in class org.apache.spark.sql.SQLContext
-
Return all the configuration properties that have been set (i.e.
- getAllPools() - Method in class org.apache.spark.SparkContext
-
Developer API
Return pools for fair scheduler
- getAllPrefLocs(RDD<?>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer.PartitionLocations
-
- GetAllReceiverInfo - Class in org.apache.spark.streaming.scheduler
-
- GetAllReceiverInfo() - Constructor for class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
-
- getAllWithPrefix(String) - Method in class org.apache.spark.SparkConf
-
Get all parameters that start with prefix
- getAlpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getAlpha() - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for getDocConcentration
- getAnyValAs(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i.
- getAppId() - Method in interface org.apache.spark.launcher.SparkAppHandle
-
Returns the application ID, or null if not yet known.
- getAppId() - Method in class org.apache.spark.SparkConf
-
Returns the Spark application id, valid in the Driver after TaskScheduler registration and
from the start in the Executor.
- getApplicationInfo(String) - Method in interface org.apache.spark.status.api.v1.UIRoot
-
- getApplicationInfoList() - Method in interface org.apache.spark.status.api.v1.UIRoot
-
- getArray(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getArray(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getArray(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getArray(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the array type value for rowId.
- getAs(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i.
- getAs(String) - Method in interface org.apache.spark.sql.Row
-
Returns the value of a given fieldName.
- getAssociationRulesFromFP(Dataset<?>, String, String, double, Map<T, Object>, ClassTag<T>) - Static method in class org.apache.spark.ml.fpm.AssociationRules
-
Computes the association rules with confidence above minConfidence.
- getAsymmetricAlpha() - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for getAsymmetricDocConcentration
- getAsymmetricDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- getAttr(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Gets an attribute by its name.
- getAttr(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Gets an attribute by its index.
- getAvroSchema() - Method in class org.apache.spark.SparkConf
-
Gets all the Avro schemas in the configuration used in the generic Avro record serializer
- getBatchingTimeout(SparkConf) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
How long we will wait for the wrappedLog in the BatchedWriteAheadLog to write the records
before we fail the write attempt to unblock receivers.
- getBernoulliSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Return the per partition sampling function used for sampling without replacement.
- getBeta() - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for getTopicConcentration
- getBinary() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
-
- getBinary() - Method in class org.apache.spark.ml.feature.HashingTF
-
- getBinary(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getBinary(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getBinary(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getBinary(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the binary type value for rowId.
- getBinaryWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getBinaryWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getBlockSize() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
-
- GetBlockStatus(BlockId, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
-
- GetBlockStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
-
- getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a boolean, falling back to a default if not set
- getBoolean(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive boolean.
- getBoolean(String, boolean) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
Returns the boolean value to which the specified key is mapped,
or defaultValue if there is no mapping for the key.
- getBoolean(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Boolean.
- getBoolean(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getBoolean(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getBoolean(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getBoolean(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the boolean type value for rowId.
- getBooleanArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Boolean array.
- getBooleans(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Gets boolean type values from [rowId, rowId + count).
- getBooleanWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getBooleanWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getBucketLength() - Method in interface org.apache.spark.ml.feature.BucketedRandomProjectionLSHParams
-
- getBuilder() - Method in class org.apache.spark.storage.memory.DeserializedValuesHolder
-
- getBuilder() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
-
- getBuilder() - Method in interface org.apache.spark.storage.memory.ValuesHolder
-
Note: After this method is called, the ValuesHolder is invalid; we can't store data or
get the estimated size again.
- getByte(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive byte.
- getByte(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getByte(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getByte(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getByte(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the byte type value for rowId.
- getBytes(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Gets byte type values from [rowId, rowId + count).
- getByteWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getByteWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
-
- getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
The three methods below are helpers for accessing the local map, a property of the SparkEnv of
the local process.
- getCacheNodeIds() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- getCallSite(Function1<String, Object>) - Static method in class org.apache.spark.util.Utils
-
When called inside a class in the spark package, returns the name of the user code class
(outside the spark package) that called into Spark, as well as which Spark method they called.
- getCaseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Get the custom datatype mapping for the given jdbc meta information.
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- getCategoricalCols() - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- getCategoricalFeatures(StructField) - Static method in class org.apache.spark.ml.util.MetadataUtils
-
Examine a schema to identify categorical (Binary and Nominal) features.
- getCategoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getCensorCol() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams
-
- getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- getCheckpointDir() - Method in class org.apache.spark.SparkContext
-
- getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Gets the name of the file to which this RDD was checkpointed
- getCheckpointFile() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- getCheckpointFile() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- getCheckpointFile() - Method in class org.apache.spark.rdd.RDD
-
Gets the name of the directory to which this RDD was checkpointed.
- getCheckpointFiles() - Method in class org.apache.spark.graphx.Graph
-
Gets the names of the files to which this Graph was checkpointed.
- getCheckpointFiles() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- getCheckpointFiles() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
Developer API
- getCheckpointInterval() - Method in interface org.apache.spark.ml.param.shared.HasCheckpointInterval
-
- getCheckpointInterval() - Method in class org.apache.spark.mllib.clustering.LDA
-
Period (in iterations) between checkpoints.
- getCheckpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getChild(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getClassifier() - Method in interface org.apache.spark.ml.classification.OneVsRestParams
-
- getColdStartStrategy() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
-
- getCollectSubModels() - Method in interface org.apache.spark.ml.param.shared.HasCollectSubModels
-
- getCombOp() - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Returns the function used to combine results returned by seqOp from different partitions.
- getComment() - Method in class org.apache.spark.sql.types.StructField
-
Return the comment of this StructField.
- getConf() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Return a copy of this JavaSparkContext's configuration.
- getConf() - Method in interface org.apache.spark.input.Configurable
-
- getConf() - Method in class org.apache.spark.rdd.HadoopRDD
-
- getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getConf() - Method in class org.apache.spark.SparkContext
-
Return a copy of this SparkContext's configuration.
- getConf(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the configuration for the given key in the current session.
- getConf(String) - Method in class org.apache.spark.sql.SQLContext
-
Return the value of Spark SQL configuration property for the given key.
- getConf(String, String) - Method in class org.apache.spark.sql.SQLContext
-
Return the value of Spark SQL configuration property for the given key.
- getConfiguration() - Method in class org.apache.spark.input.PortableDataStream
-
- getConfiguredLocalDirs(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Return the configured local directories where Spark can write files.
- getConnection() - Method in interface org.apache.spark.rdd.JdbcRDD.ConnectionFactory
-
- getContextOrSparkClassLoader() - Static method in class org.apache.spark.util.Utils
-
Get the Context ClassLoader on this thread or, if not present, the ClassLoader that
loaded Spark.
- getConvergenceTol() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the largest change in log-likelihood at which convergence is
considered to have occurred.
- getCorrelationFromName(String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- getCount() - Method in class org.apache.spark.storage.CountingWritableChannel
-
- getCurrentProcessingTimeMs() - Method in interface org.apache.spark.sql.streaming.GroupState
-
Get the current processing time as milliseconds in epoch time.
- getCurrentUserGroups(SparkConf, String) - Static method in class org.apache.spark.util.Utils
-
- getCurrentUserName() - Static method in class org.apache.spark.util.Utils
-
Returns the current user name.
- getCurrentWatermarkMs() - Method in interface org.apache.spark.sql.streaming.GroupState
-
Get the current event time watermark as milliseconds in epoch time.
- getData(Row) - Static method in class org.apache.spark.ml.image.ImageSchema
-
Gets the image data
- getDatabase(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Get the database with the specified name.
- getDatabase(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the metadata for the specified database, throwing an exception if it doesn't exist
- getDate(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of date type as java.sql.Date.
- getDateWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getDateWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getDecimal(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of decimal type as java.math.BigDecimal.
- getDecimal(int, int, int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getDecimal(int, int, int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getDecimal(int, int, int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getDecimal(int, int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the decimal type value for rowId.
- getDecimalWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getDecimalWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params
-
Gets the default value of a parameter.
- getDefaultPropertiesFile(Map<String, String>) - Static method in class org.apache.spark.util.Utils
-
Return the path of the default Spark properties file.
- getDefaultSession() - Static method in class org.apache.spark.sql.SparkSession
-
Returns the default SparkSession that is returned by the builder.
- getDegree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-
- getDenseSizeInBytes() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Gets the size of the dense representation of this `Matrix`.
- getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- getDeprecatedConfig(String, Map<String, String>) - Static method in class org.apache.spark.SparkConf
-
Looks for available deprecated keys for the given config option, and returns the first
value available.
- getDistanceMeasure() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- getDistanceMeasure() - Method in interface org.apache.spark.ml.param.shared.HasDistanceMeasure
-
- getDistanceMeasure() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
The distance suite used by the algorithm.
- getDistanceMeasure() - Method in class org.apache.spark.mllib.clustering.KMeans
-
The distance suite used by the algorithm.
- getDistributions() - Method in class org.apache.spark.status.LiveRDD
-
- getDocConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- getDouble(String, double) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a double, falling back to a default if not set
- getDouble(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive double.
- getDouble(String, double) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
Returns the double value to which the specified key is mapped,
or defaultValue if there is no mapping for the key.
- getDouble(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Double.
- getDouble(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getDouble(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getDouble(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getDouble(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the double type value for rowId.
- getDoubleArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Double array.
- getDoubles(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Gets double type values from [rowId, rowId + count).
- getDoubleWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getDoubleWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getDriverLogUrls() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
Get the URLs for the driver logs.
- getDropLast() - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- getDropLast() - Method in interface org.apache.spark.ml.feature.OneHotEncoderBase
-
- getDstCol() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
-
- getDynamicAllocationInitialExecutors(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Return the initial number of executors for dynamic allocation.
- getElasticNetParam() - Method in interface org.apache.spark.ml.param.shared.HasElasticNetParam
-
- getEncryptionEnabled(JavaSparkContext) - Static method in class org.apache.spark.api.r.RUtils
-
- getEndOffset() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader
-
Return the specified (if explicitly set through setOffsetRange) or inferred end offset
for this reader.
- getEndTimeEpoch() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- getEpsilon() - Method in interface org.apache.spark.ml.regression.LinearRegressionParams
-
- getEpsilon() - Method in class org.apache.spark.mllib.clustering.KMeans
-
The distance threshold within which we consider centers to have converged.
- getEstimator() - Method in interface org.apache.spark.ml.tuning.ValidatorParams
-
- getEstimatorParamMaps() - Method in interface org.apache.spark.ml.tuning.ValidatorParams
-
- getEvaluator() - Method in interface org.apache.spark.ml.tuning.ValidatorParams
-
- getExecutionContext() - Method in interface org.apache.spark.ml.param.shared.HasParallelism
-
Create a new execution context with a thread-pool that has a maximum number of threads
set to the value of parallelism.
- GetExecutorEndpointRef(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetExecutorEndpointRef
-
- GetExecutorEndpointRef$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetExecutorEndpointRef$
-
- getExecutorEnv() - Method in class org.apache.spark.SparkConf
-
Get all executor environment variables set on this SparkConf
- getExecutorIds() - Method in interface org.apache.spark.ExecutorAllocationClient
-
Get the list of currently active executors
- getExecutorInfos() - Method in class org.apache.spark.SparkStatusTracker
-
Returns information of all known executors, including host, port, cacheSize, numRunningTasks
and memory metrics.
- GetExecutorLossReason(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.GetExecutorLossReason
-
- GetExecutorLossReason$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.GetExecutorLossReason$
-
- getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext
-
Return a map from the slave to the max memory available for caching and the remaining
memory available for caching.
- getExternalScratchDir(URI, Configuration, String) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
-
- getExternalTmpPath(SparkSession, Configuration, Path) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
-
- getExtTmpPathRelTo(Path, Configuration, String) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
-
- getFamily() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
- getFamily() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
- getFdr() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
- getFeatureIndex() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase
-
- getFeatureIndicesFromNames(StructField, String[]) - Static method in class org.apache.spark.ml.util.MetadataUtils
-
Takes a Vector column and a list of feature names, and returns the corresponding list of
feature indices in the column, in order.
- getFeaturesAndLabels(RFormulaModel, Dataset<?>) - Static method in class org.apache.spark.ml.r.RWrapperUtils
-
Get the feature names and original labels from the schema
of DataFrame transformed by RFormulaModel.
- getFeaturesCol() - Method in interface org.apache.spark.ml.param.shared.HasFeaturesCol
-
- getFeatureSubsetStrategy() - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
-
- getField(String) - Method in class org.apache.spark.sql.Column
-
An expression that gets a field by name in a StructType.
- getFileLength(File, SparkConf) - Static method in class org.apache.spark.util.Utils
-
Return the file length; if the file is compressed, return the uncompressed file length.
- getFileReader(String, Option<Configuration>, boolean) - Static method in class org.apache.spark.sql.hive.orc.OrcFileOperator
-
Retrieves an ORC file reader from a given path.
- getFileSegmentLocations(String, long, long, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
Get the locations of the HDFS blocks containing the given file segment.
- getFileSystemForPath(Path, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- getFinalStorageLevel() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getFinalValue() - Method in class org.apache.spark.partial.PartialResult
-
Blocking method to wait for and return the final value.
- getFitIntercept() - Method in interface org.apache.spark.ml.param.shared.HasFitIntercept
-
- getFloat(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive float.
- getFloat(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getFloat(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getFloat(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getFloat(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the float type value for rowId.
- getFloats(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Gets float type values from [rowId, rowId + count).
- getFloatWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getFloatWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getForceIndexLabel() - Method in interface org.apache.spark.ml.feature.RFormulaBase
-
- getFormattedClassName(Object) - Static method in class org.apache.spark.util.Utils
-
Return the class name of the given object, removing all dollar signs
- getFormula() - Method in interface org.apache.spark.ml.feature.RFormulaBase
-
- getFpr() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
- getFunction(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Get the function with the specified name.
- getFunction(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Get the function with the specified name.
- getFunction(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Return an existing function in the database, assuming it exists.
- getFunctionOption(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Return an existing function in the database, or None if it doesn't exist.
- getFwe() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
- getGaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- getGroups(String) - Method in interface org.apache.spark.security.GroupMappingServiceProvider
-
Get the groups the user belongs to.
- getHadoopFileSystem(URI, Configuration) - Static method in class org.apache.spark.util.Utils
-
Return a Hadoop FileSystem with the scheme encoded in the given path.
- getHadoopFileSystem(String, Configuration) - Static method in class org.apache.spark.util.Utils
-
Return a Hadoop FileSystem with the scheme encoded in the given path.
- getHandleInvalid() - Method in interface org.apache.spark.ml.param.shared.HasHandleInvalid
-
- getHeight(Row) - Static method in class org.apache.spark.ml.image.ImageSchema
-
Gets the height of the image
- getHiveWriteCompression(TableDesc, SQLConf) - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- getImplicitPrefs() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getImpurity() - Method in interface org.apache.spark.ml.tree.TreeClassifierParams
-
- getImpurity() - Method in interface org.apache.spark.ml.tree.TreeRegressorParams
-
- getImpurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getIndices() - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- getInitializationMode() - Method in class org.apache.spark.mllib.clustering.KMeans
-
The initialization algorithm.
- getInitializationSteps() - Method in class org.apache.spark.mllib.clustering.KMeans
-
Number of steps for the k-means|| initialization mode
- getInitialModel() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the user supplied initial GMM, if supplied
- getInitialPositionInStream(int) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
-
- getInitialTargetExecutorNumber(SparkConf, int) - Static method in class org.apache.spark.scheduler.cluster.SchedulerBackendUtils
-
Getting the initial target number of executors depends on whether dynamic allocation is
enabled.
- getInitialWeights() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
-
- getInitMode() - Method in interface org.apache.spark.ml.clustering.KMeansParams
-
- getInitMode() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
-
- getInitSteps() - Method in interface org.apache.spark.ml.clustering.KMeansParams
-
- getInputCol() - Method in interface org.apache.spark.ml.param.shared.HasInputCol
-
- getInputCols() - Method in interface org.apache.spark.ml.param.shared.HasInputCols
-
- getInputFilePath() - Static method in class org.apache.spark.rdd.InputFileBlockHolder
-
Returns the holding file name or empty string if it is unknown.
- getInputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- getInt(String, int) - Method in class org.apache.spark.SparkConf
-
Get a parameter as an integer, falling back to a default if not set
- getInt(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive int.
- getInt(String, int) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
Returns the integer value to which the specified key is mapped,
or defaultValue if there is no mapping for the key.
- getInt(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getInt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getInt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getInt(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the int type value for rowId.
- getIntermediateStorageLevel() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getInterval(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getInterval(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getInterval(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the calendar interval type value for rowId.
- getInts(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Gets int type values from [rowId, rowId + count).
- getIntWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getIntWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getInverse() - Method in class org.apache.spark.ml.feature.DCT
-
- getIsotonic() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase
-
- getItem(Object) - Method in class org.apache.spark.sql.Column
-
An expression that gets an item at position ordinal out of an array,
or gets a value by key key in a MapType.
- getItemCol() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
-
- getItemsCol() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
-
- getIteratorSize(Iterator<?>) - Static method in class org.apache.spark.util.Utils
-
Counts the number of elements of an iterator using a while loop rather than calling
TraversableOnce.size(), because the latter uses a for loop, which is slightly slower
in the current version of Scala.
- getIteratorZipWithIndex(Iterator<T>, long) - Static method in class org.apache.spark.util.Utils
-
Generates a zipWithIndex iterator that avoids the index value overflow problem in Scala's zipWithIndex.
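A minimal sketch of the idea, assuming the overflow concern is an `int` index wrapping around on very long iterators: pair each element with a `long` index that starts at a caller-supplied value. The class `ZipWithLongIndexSketch` and the use of `Map.Entry` as the pair type are assumptions for illustration, not Spark's implementation.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;

// Illustrative only: not the body of Utils.getIteratorZipWithIndex.
final class ZipWithLongIndexSketch {
    // Wraps the source iterator so each element is paired with a long index.
    static <T> Iterator<Map.Entry<T, Long>> zipWithIndex(Iterator<T> source, long startIndex) {
        return new Iterator<Map.Entry<T, Long>>() {
            private long index = startIndex;

            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public Map.Entry<T, Long> next() {
                // A long index keeps increasing past Integer.MAX_VALUE instead of wrapping.
                return new SimpleImmutableEntry<>(source.next(), index++);
            }
        };
    }

    public static void main(String[] args) {
        Iterator<Map.Entry<String, Long>> it =
            zipWithIndex(Arrays.asList("a", "b").iterator(), (long) Integer.MAX_VALUE);
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```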
- getJavaMap(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of map type as a java.util.Map.
- getJavaSparkContext(SparkSession) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Retrieve the jdbc / sql type for a given datatype.
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- getJobIdsForGroup(String) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Return a list of all known jobs in a particular job group.
- getJobIdsForGroup(String) - Method in class org.apache.spark.SparkStatusTracker
-
Return a list of all known jobs in a particular job group.
- getJobInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns job information, or null if the job info could not be found or was garbage collected.
- getJobInfo(int) - Method in class org.apache.spark.SparkStatusTracker
-
Returns job information, or None if the job info could not be found or was garbage collected.
- getK() - Method in interface org.apache.spark.ml.clustering.BisectingKMeansParams
-
- getK() - Method in interface org.apache.spark.ml.clustering.GaussianMixtureParams
-
- getK() - Method in interface org.apache.spark.ml.clustering.KMeansParams
-
- getK() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getK() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
-
- getK() - Method in interface org.apache.spark.ml.feature.PCAParams
-
- getK() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Gets the desired number of leaf clusters.
- getK() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the number of Gaussians in the mixture model
- getK() - Method in class org.apache.spark.mllib.clustering.KMeans
-
Number of clusters to create (k).
- getK() - Method in class org.apache.spark.mllib.clustering.LDA
-
Number of topics to infer, i.e., the number of soft cluster centers.
- getKappa() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Learning rate: exponential decay rate
- getKeepLastCheckpoint() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getKeepLastCheckpoint() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-
If using checkpointing, this indicates whether to keep the last checkpoint (vs clean up).
- getLabelCol() - Method in interface org.apache.spark.ml.param.shared.HasLabelCol
-
- getLabels() - Method in class org.apache.spark.ml.feature.IndexToString
-
- getLambda() - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Get the smoothing parameter.
- getLastUpdatedEpoch() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- getLayers() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
-
- getLDAModel(double[]) - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer
-
- getLearningDecay() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getLearningOffset() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getLearningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getLeastGroupHash(String) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
Sorts and gets the least element of the list associated with key in groupHash.
The returned PartitionGroup is the least loaded of all groups that represent the machine "key".
- getLength() - Static method in class org.apache.spark.rdd.InputFileBlockHolder
-
Returns the length of the block being read, or -1 if it is unknown.
- getLink() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
- getLinkPower() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
- getLinkPredictionCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
- getList(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of array type as a java.util.List.
- getLocalDir(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Get the path of a temporary directory.
- getLocale() - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get a local property set in this thread, or null if it is missing.
- getLocalProperty(String) - Method in class org.apache.spark.BarrierTaskContext
-
- getLocalProperty(String) - Method in class org.apache.spark.SparkContext
-
Get a local property set in this thread, or null if it is missing.
- getLocalProperty(String) - Method in class org.apache.spark.TaskContext
-
Get a local property set upstream in the driver, or null if it is missing.
- getLocalUserJarsForShell(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Return the local jar files which will be added to REPL's classpath.
- GetLocations(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations
-
- GetLocations$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations$
-
- GetLocationsAndStatus(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsAndStatus
-
- GetLocationsAndStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsAndStatus$
-
- GetLocationsMultipleBlockIds(BlockId[]) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
-
- GetLocationsMultipleBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
-
- getLong(String, long) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a long, falling back to a default if not set
- getLong(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive long.
- getLong(String, long) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
Returns the long value to which the specified key is mapped,
or defaultValue if there is no mapping for the key.
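The lookup-with-fallback behavior described for these getters can be sketched over a plain `Map<String, String>`. The class `OptionLookupSketch` is hypothetical and does not reproduce `DataSourceOptions` itself; in particular, any key normalization the real class performs is not modeled here.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a plain-map stand-in for an options object.
final class OptionLookupSketch {
    // Returns the parsed long for the key, or defaultValue when the key has no mapping.
    static long getLong(Map<String, String> options, String key, long defaultValue) {
        String raw = options.get(key);
        return raw == null ? defaultValue : Long.parseLong(raw);
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("batchSize", "500");
        System.out.println(getLong(options, "batchSize", 100L));
        System.out.println(getLong(options, "timeoutMs", 3000L));
    }
}
```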
- getLong(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Long.
- getLong(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getLong(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getLong(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getLong(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the long type value for rowId.
- getLongArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Long array.
- getLongs(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Gets long type values from [rowId, rowId + count).
- getLongWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getLongWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getLoss() - Method in interface org.apache.spark.ml.param.shared.HasLoss
-
- getLoss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getLossType() - Method in interface org.apache.spark.ml.tree.GBTClassifierParams
-
- getLossType() - Method in interface org.apache.spark.ml.tree.GBTRegressorParams
-
- getLowerBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds
-
Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p, it is very unlikely to have more than fraction * n successes.
- getLowerBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds
-
Returns a lambda such that Pr[X > s] is very small, where X ~ Pois(lambda).
- getLowerBoundsOnCoefficients() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
- getLowerBoundsOnIntercepts() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
- getMap(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of map type as a Scala Map.
- getMap(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getMap(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getMap(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getMap(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the map type value for rowId.
- GetMatchingBlockIds(Function1<BlockId, Object>, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
-
- GetMatchingBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
-
- getMax() - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams
-
- getMaxBins() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- getMaxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMaxCategories() - Method in interface org.apache.spark.ml.feature.VectorIndexerParams
-
- getMaxDepth() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- getMaxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMaxDF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
-
- getMaxFailures(SparkConf, boolean) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- getMaxIter() - Method in interface org.apache.spark.ml.param.shared.HasMaxIter
-
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Gets the max number of k-means iterations to split clusters.
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the maximum number of iterations allowed
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.KMeans
-
Maximum number of iterations allowed.
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.LDA
-
Maximum number of iterations allowed.
- getMaxLocalProjDBSize() - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- getMaxLocalProjDBSize() - Method in class org.apache.spark.mllib.fpm.PrefixSpan
-
Gets the maximum number of items allowed in a projected database before local processing.
- getMaxMemoryInMB() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- getMaxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMaxPatternLength() - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- getMaxPatternLength() - Method in class org.apache.spark.mllib.fpm.PrefixSpan
-
Gets the maximal pattern length (i.e.
- getMaxSentenceLength() - Method in interface org.apache.spark.ml.feature.Word2VecBase
-
- GetMemoryStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
-
- getMessage() - Method in exception org.apache.spark.sql.AnalysisException
-
- getMetadata(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Metadata.
- getMetadataArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Metadata array.
- getMetricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- getMetricName() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- getMetricName() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- getMetricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- getMetricsSources(String) - Method in class org.apache.spark.BarrierTaskContext
-
- getMetricsSources(String) - Method in class org.apache.spark.TaskContext
-
Developer API
Returns all metrics sources with the given name which are associated with the instance
which runs the task.
- getMin() - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams
-
- getMinConfidence() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
-
- getMinCount() - Method in interface org.apache.spark.ml.feature.Word2VecBase
-
- getMinDF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
-
- getMinDivisibleClusterSize() - Method in interface org.apache.spark.ml.clustering.BisectingKMeansParams
-
- getMinDivisibleClusterSize() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Gets the minimum number of points (if greater than or equal to 1.0) or the minimum proportion of points (if less than 1.0) of a divisible cluster.
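One way to read this dual-purpose setting is to resolve it to an absolute point count before use. The sketch below does that under the assumption that fractional results are rounded up; the class `MinDivisibleClusterSizeSketch` and its `minPoints` helper are hypothetical and are not claimed to match BisectingKMeans internals.

```java
// Illustrative only: the rounding rule (ceil) is an assumption of this sketch.
final class MinDivisibleClusterSizeSketch {
    // Resolves the setting to a minimum number of points for a divisible cluster.
    static long minPoints(double minDivisibleClusterSize, long totalPoints) {
        if (minDivisibleClusterSize >= 1.0) {
            // Values of 1.0 or more are treated as an absolute point count.
            return (long) Math.ceil(minDivisibleClusterSize);
        }
        // Values below 1.0 are treated as a proportion of all points.
        return (long) Math.ceil(minDivisibleClusterSize * totalPoints);
    }

    public static void main(String[] args) {
        System.out.println(minPoints(5.0, 1000L));
        System.out.println(minPoints(0.25, 1000L));
    }
}
```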
- getMinDocFreq() - Method in interface org.apache.spark.ml.feature.IDFBase
-
- getMiniBatchFraction() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Mini-batch fraction, which sets the fraction of documents sampled and used in each iteration.
- getMinInfoGain() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- getMinInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMinInstancesPerNode() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- getMinInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMinSupport() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
-
- getMinSupport() - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- getMinSupport() - Method in class org.apache.spark.mllib.fpm.PrefixSpan
-
Get the minimal support (i.e.
- getMinTF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
-
- getMinTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- getMissingValue() - Method in interface org.apache.spark.ml.feature.ImputerParams
-
- getMode(Row) - Static method in class org.apache.spark.ml.image.ImageSchema
-
Gets the OpenCV representation as an int
- getModelType() - Method in interface org.apache.spark.ml.classification.NaiveBayesParams
-
- getModelType() - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Get the model type.
- getN() - Method in class org.apache.spark.ml.feature.NGram
-
- getNames() - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- getNChannels(Row) - Static method in class org.apache.spark.ml.image.ImageSchema
-
Gets the number of channels in the image
- getNode(int, Node) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Traces down from a root node to get the node with the given node index.
- getNonnegative() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getNumBuckets() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase
-
- getNumBucketsArray() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase
-
- getNumClasses(StructField) - Static method in class org.apache.spark.ml.util.MetadataUtils
-
Examine a schema to identify the number of classes in a label column.
- getNumClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getNumFeatures() - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- getNumFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
-
- getNumFeatures() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
The dimension of training features.
- getNumFolds() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
-
- getNumHashTables() - Method in interface org.apache.spark.ml.feature.LSHParams
-
- getNumItemBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getNumIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getNumObjFields() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- getNumPartitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the number of partitions in this RDD.
- getNumPartitions() - Method in interface org.apache.spark.ml.feature.Word2VecBase
-
- getNumPartitions() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
-
- getNumPartitions() - Method in class org.apache.spark.rdd.RDD
-
Returns the number of partitions of this RDD.
- getNumTopFeatures() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
- getNumTrees() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
Number of trees in ensemble
- getNumTrees() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
Number of trees in ensemble
- getNumTrees() - Method in interface org.apache.spark.ml.tree.RandomForestParams
-
- getNumUserBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getNumValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
Get the number of values, either from numValues or from values.
- getObjectInspector(String, Option<Configuration>) - Static method in class org.apache.spark.sql.hive.orc.OrcFileOperator
-
- getObjFieldValues(Object, Object[]) - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- getOffset() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousInputPartitionReader
-
Get the offset of the current record, or the start offset if no records have been read.
- getOffsetCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
- getOldBoostingStrategy(Map<Object, Object>, Enumeration.Value) - Method in interface org.apache.spark.ml.tree.GBTParams
-
(private[ml]) Create a BoostingStrategy instance to use with the old API.
- getOldDocConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
Get docConcentration used by spark.mllib LDA
- getOldImpurity() - Method in interface org.apache.spark.ml.tree.TreeClassifierParams
-
Convert new impurity to old impurity.
- getOldImpurity() - Method in interface org.apache.spark.ml.tree.TreeRegressorParams
-
Convert new impurity to old impurity.
- getOldLossType() - Method in interface org.apache.spark.ml.tree.GBTClassifierParams
-
(private[ml]) Convert new loss to old loss.
- getOldLossType() - Method in interface org.apache.spark.ml.tree.GBTParams
-
Get old Gradient Boosting Loss type
- getOldLossType() - Method in interface org.apache.spark.ml.tree.GBTRegressorParams
-
(private[ml]) Convert new loss to old loss.
- getOldOptimizer() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getOldStrategy(Map<Object, Object>, int, Enumeration.Value, Impurity, double) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
(private[ml]) Create a Strategy instance to use with the old API.
- getOldStrategy(Map<Object, Object>, int, Enumeration.Value, Impurity) - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
-
Create a Strategy instance to use with the old API.
- getOldTopicConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
Get topicConcentration used by spark.mllib LDA
- getOptimizeDocConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getOptimizeDocConcentration() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Optimize docConcentration, indicates whether docConcentration (Dirichlet parameter for
document-topic distribution) will be optimized during training.
- getOptimizer() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getOptimizer() - Method in class org.apache.spark.mllib.clustering.LDA
-
Developer API
- getOption(String) - Method in class org.apache.spark.SparkConf
-
Get a parameter as an Option
- getOption(String) - Method in class org.apache.spark.sql.RuntimeConfig
-
Returns the value of Spark runtime configuration property for the given key.
- getOption() - Method in interface org.apache.spark.sql.streaming.GroupState
-
Get the state value as a scala Option.
- getOption() - Method in class org.apache.spark.streaming.State
-
Get the state as a scala.Option.
- getOrCreate(SparkConf) - Static method in class org.apache.spark.SparkContext
-
This function may be used to get or instantiate a SparkContext and register it as a
singleton object.
- getOrCreate() - Static method in class org.apache.spark.SparkContext
-
This function may be used to get or instantiate a SparkContext and register it as a
singleton object.
- getOrCreate() - Method in class org.apache.spark.sql.SparkSession.Builder
-
Gets an existing SparkSession or, if there is no existing one, creates a new one based on the options set in this builder.
- getOrCreate(SparkContext) - Static method in class org.apache.spark.sql.SQLContext
-
- getOrCreate(String, Function0<JavaStreamingContext>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Function0<JavaStreamingContext>, Configuration) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Function0<JavaStreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreateSparkSession(JavaSparkContext, Map<Object, Object>, boolean) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- getOrDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params
-
Gets the value of a param in the embedded param map or its default value.
- getOrElse(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
-
Returns the value associated with a param or a default value.
- getOrigin(Row) - Static method in class org.apache.spark.ml.image.ImageSchema
-
Gets the origin of the image
- getOutputAttrGroupFromData(Dataset<?>, Seq<String>, Seq<String>, boolean) - Static method in class org.apache.spark.ml.feature.OneHotEncoderCommon
-
This method is called when we want to generate AttributeGroup from actual data for one-hot encoder.
- getOutputCol() - Method in interface org.apache.spark.ml.param.shared.HasOutputCol
-
- getOutputCols() - Method in interface org.apache.spark.ml.param.shared.HasOutputCols
-
- getOutputSize(int) - Method in interface org.apache.spark.ml.ann.Layer
-
Returns the output size given the input size (not counting the stack size).
- getOutputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- getP() - Method in class org.apache.spark.ml.feature.Normalizer
-
- getParallelism() - Method in interface org.apache.spark.ml.param.shared.HasParallelism
-
- getParam(String) - Method in interface org.apache.spark.ml.param.Params
-
Gets a param by its name.
- getParents(int) - Method in class org.apache.spark.NarrowDependency
-
Get the parent partitions for a child partition.
- getParents(int) - Method in class org.apache.spark.OneToOneDependency
-
- getParents(int) - Method in class org.apache.spark.RangeDependency
-
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-
- getPartition(long, long, int) - Method in interface org.apache.spark.graphx.PartitionStrategy
-
Returns the partition number for a given edge.
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
- getPartition(Object) - Method in class org.apache.spark.HashPartitioner
-
- getPartition(Object) - Method in class org.apache.spark.Partitioner
-
- getPartition(Object) - Method in class org.apache.spark.RangePartitioner
-
- getPartition(String, String, Map<String, String>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the specified partition, or throws `NoSuchPartitionException`.
- getPartitionId() - Static method in class org.apache.spark.TaskContext
-
Returns the partition id of currently active TaskContext.
- getPartitionNames(CatalogTable, Option<Map<String, String>>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the partition names for the given table that match the supplied partition spec.
- getPartitionOption(String, String, Map<String, String>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the specified partition or None if it does not exist.
- getPartitionOption(CatalogTable, Map<String, String>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the specified partition or None if it does not exist.
- getPartitions() - Method in class org.apache.spark.api.r.BaseRRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
-
- getPartitions(String, String, Option<Map<String, String>>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the partitions for the given table that match the supplied partition spec.
- getPartitions(CatalogTable, Option<Map<String, String>>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the partitions for the given table that match the supplied partition spec.
- getPartitions() - Method in class org.apache.spark.status.LiveRDD
-
- getPartitionsByFilter(CatalogTable, Seq<Expression>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns partitions filtered by predicates for the given table.
- getPath() - Method in class org.apache.spark.input.PortableDataStream
-
- getPattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- GetPeers(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers
-
- GetPeers$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers$
-
- getPercentile() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
- getPersistentRDDs() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Returns a Java map of JavaRDDs that have marked themselves as persistent via cache() call.
- getPersistentRDDs() - Method in class org.apache.spark.SparkContext
-
Returns an immutable map of RDDs that have marked themselves as persistent via cache() call.
- getPmml() - Method in interface org.apache.spark.mllib.pmml.export.PMMLModelExport
-
- getPoissonSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Return the per partition sampling function used for sampling with replacement.
- getPoolForName(String) - Method in class org.apache.spark.SparkContext
-
Developer API
Return the pool associated with the given name, if one exists
- getPosition() - Method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions.AtTimestamp
-
- getPosition() - Method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions.Latest
-
- getPosition() - Method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions.TrimHorizon
-
- getPredictionCol() - Method in interface org.apache.spark.ml.param.shared.HasPredictionCol
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
-
- getPrimitiveNullWritableConstantObjectInspector() - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getProbabilityCol() - Method in interface org.apache.spark.ml.param.shared.HasProbabilityCol
-
- getProcessName() - Static method in class org.apache.spark.util.Utils
-
Returns the name of this JVM process.
- getPropertiesFromFile(String) - Static method in class org.apache.spark.util.Utils
-
Load properties present in the given file.
- getQuantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getQuantileProbabilities() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams
-
- getQuantilesCol() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams
-
- getRandomSample(Seq<T>, int, Random) - Static method in class org.apache.spark.storage.BlockReplicationUtils
-
Get a random sample of size m from the elems
- getRank() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getRatingCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getRawPredictionCol() - Method in interface org.apache.spark.ml.param.shared.HasRawPredictionCol
-
- getRDDStorageInfo() - Method in class org.apache.spark.SparkContext
-
Developer API
Return information about which RDDs are cached, whether they are in memory or on disk, how much space
they take, etc.
- getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
Gets the receiver object that will be sent to the worker nodes
to receive data.
- getRegParam() - Method in interface org.apache.spark.ml.param.shared.HasRegParam
-
- getRelativeError() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase
-
- getResource(String) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
-
- getResources(String) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
-
- getRollingIntervalSecs(SparkConf, boolean) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- getRootDirectory() - Static method in class org.apache.spark.SparkFiles
-
Get the root directory that contains files added through SparkContext.addFile().
- getRow(int) - Method in class org.apache.spark.sql.vectorized.ColumnarBatch
-
Returns the row in this batch at `rowId`.
- getRuns() - Method in class org.apache.spark.mllib.clustering.KMeans
-
- getScalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-
- getSchedulableByName(String) - Method in interface org.apache.spark.scheduler.Schedulable
-
- getSchedulingMode() - Method in class org.apache.spark.SparkContext
-
Return current scheduling mode
- getSchemaQuery(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- getSchemaQuery(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
The SQL query that should be used to discover the schema of a table.
- getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- getSeed() - Method in interface org.apache.spark.ml.param.shared.HasSeed
-
- getSeed() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Gets the random seed.
- getSeed() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the random seed
- getSeed() - Method in class org.apache.spark.mllib.clustering.KMeans
-
The random seed for cluster initialization.
- getSeed() - Method in class org.apache.spark.mllib.clustering.LDA
-
Random seed for cluster initialization.
- getSeed() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Random seed for cluster initialization.
- getSelectorType() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
- getSeq(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of array type as a Scala Seq.
- getSeqOp(boolean, Map<K, Object>, org.apache.spark.util.random.StratifiedSamplingUtils.RandomDataGenerator, Option<Map<K, Object>>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Returns the function used by aggregate to collect sampling statistics for each partition.
- getSequenceCol() - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- getSerializationProxy(Object) - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
Check if the given reference is an indylambda style Scala closure.
- getSessionConf(SparkSession) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- getShort(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive short.
- getShort(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getShort(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getShort(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getShort(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the short type value for rowId.
- getShorts(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Gets short type values from [rowId, rowId + count).
- getShortWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getShortWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getSimpleMessage() - Method in exception org.apache.spark.sql.AnalysisException
-
- getSimpleName(Class<?>) - Static method in class org.apache.spark.util.Utils
-
Safer than Class.getSimpleName, which may throw a "Malformed class name" error in Scala.
- getSize() - Method in class org.apache.spark.ml.feature.VectorSizeHint
-
- getSizeAsBytes(String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as bytes; throws a NoSuchElementException if it's not set.
- getSizeAsBytes(String, String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as bytes, falling back to a default if not set.
- getSizeAsBytes(String, long) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as bytes, falling back to a default if not set.
- getSizeAsGb(String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Gibibytes; throws a NoSuchElementException if it's not set.
- getSizeAsGb(String, String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Gibibytes, falling back to a default if not set.
- getSizeAsKb(String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Kibibytes; throws a NoSuchElementException if it's not set.
- getSizeAsKb(String, String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Kibibytes, falling back to a default if not set.
- getSizeAsMb(String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Mebibytes; throws a NoSuchElementException if it's not set.
- getSizeAsMb(String, String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Mebibytes, falling back to a default if not set.
- getSizeForBlock(int) - Method in interface org.apache.spark.scheduler.MapStatus
-
Estimated size for the reduce block, in bytes.
- getSizeInBytes() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Gets the current size in bytes of this `Matrix`.
- getSlotDescs() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- getSmoothing() - Method in interface org.apache.spark.ml.classification.NaiveBayesParams
-
- getSolver() - Method in interface org.apache.spark.ml.param.shared.HasSolver
-
- getSortedTaskSetQueue() - Method in interface org.apache.spark.scheduler.Schedulable
-
- getSparkClassLoader() - Static method in class org.apache.spark.util.Utils
-
Get the ClassLoader which loaded Spark.
- getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get Spark's home location from either a value set through the constructor,
or the spark.home Java property, or the SPARK_HOME environment variable
(in that order of preference).
- getSparkOrYarnConfig(SparkConf, String, String) - Static method in class org.apache.spark.util.Utils
-
Return the value of a config either through the SparkConf or the Hadoop configuration.
- getSparseSizeInBytes(boolean) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Gets the size of the minimal sparse representation of this `Matrix`.
- getSplit() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
-
- getSplits() - Method in class org.apache.spark.ml.feature.Bucketizer
-
- getSplitsArray() - Method in class org.apache.spark.ml.feature.Bucketizer
-
- getSrcCol() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
-
- getStageInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns stage information, or null if the stage info could not be found or was garbage collected.
- getStageInfo(int) - Method in class org.apache.spark.SparkStatusTracker
-
Returns stage information, or None if the stage info could not be found or was garbage collected.
- getStagePath(String, int, int, String) - Method in class org.apache.spark.ml.Pipeline.SharedReadWrite$
-
Get path for saving the given stage.
- getStages() - Method in class org.apache.spark.ml.Pipeline
-
- getStagingDir(Path, Configuration, String) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
-
- getStandardization() - Method in interface org.apache.spark.ml.param.shared.HasStandardization
-
- getStartOffset() - Static method in class org.apache.spark.rdd.InputFileBlockHolder
-
Returns the starting offset of the block currently being read, or -1 if it is unknown.
- getStartOffset() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader
-
Return the specified or inferred start offset for this reader.
- getStartOffset() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader
-
Returns the specified (if explicitly set through setOffsetRange) or inferred start offset
for this reader.
- getStartTimeEpoch() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- getState() - Method in interface org.apache.spark.launcher.SparkAppHandle
-
Returns the current application state.
- getState() - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Return the associated Hive SessionState of this HiveClientImpl
- getState() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Developer API
- getState() - Method in class org.apache.spark.streaming.StreamingContext
-
Developer API
- getStatement() - Method in class org.apache.spark.ml.feature.SQLTransformer
-
- getStderr(Process, long) - Static method in class org.apache.spark.util.Utils
-
Return the stderr of a process after waiting for the process to terminate.
- getStepSize() - Method in interface org.apache.spark.ml.param.shared.HasStepSize
-
- getStopWords() - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
- getStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- getStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- getStorageLevel() - Method in class org.apache.spark.rdd.RDD
-
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
- GetStorageStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
-
- getStrategy() - Method in interface org.apache.spark.ml.feature.ImputerParams
-
- getString(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a String object.
- getString(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a String.
- getStringArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a String array.
- getStringIndexerOrderType() - Method in interface org.apache.spark.ml.feature.RFormulaBase
-
- getStringOrderType() - Method in interface org.apache.spark.ml.feature.StringIndexerBase
-
- getStringWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getStringWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getStruct(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of struct type as a Row object.
- getStruct(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getStruct(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getStruct(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the struct type value for rowId.
- getSubsamplingRate() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getSubsamplingRate() - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
-
- getSubsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getSystemProperties() - Static method in class org.apache.spark.util.Utils
-
Returns the system properties map that is thread-safe to iterate over.
- getTable(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Get the table or view with the specified name.
- getTable(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Get the table or view with the specified name in the specified database.
- getTable(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the specified table, or throws `NoSuchTableException`.
- getTableExistsQuery(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- getTableExistsQuery(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Get the SQL query that should be used to find if the given table exists.
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- getTableNames(SparkSession, String) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- getTableOption(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the metadata for the specified table or None if it doesn't exist.
- getTables(SparkSession, String) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- getTaskInfos() - Method in class org.apache.spark.BarrierTaskContext
-
Experimental
Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.
- getTau0() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
A (positive) learning parameter that downweights early iterations.
- getThreadDump() - Static method in class org.apache.spark.util.Utils
-
Return a thread dump of all threads' stacktraces.
- getThreadDumpForThread(long) - Static method in class org.apache.spark.util.Utils
-
- getThreshold() - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- getThreshold() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- getThreshold() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
Get threshold for binary classification.
- getThreshold() - Method in class org.apache.spark.ml.feature.Binarizer
-
- getThreshold() - Method in interface org.apache.spark.ml.param.shared.HasThreshold
-
- getThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
- getThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
- getThresholds() - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- getThresholds() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- getThresholds() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
Get thresholds for binary or multiclass classification.
- getThresholds() - Method in interface org.apache.spark.ml.param.shared.HasThresholds
-
- getTimeAsMs(String) - Method in class org.apache.spark.SparkConf
-
Get a time parameter as milliseconds; throws a NoSuchElementException if it's not set.
- getTimeAsMs(String, String) - Method in class org.apache.spark.SparkConf
-
Get a time parameter as milliseconds, falling back to a default if not set.
- getTimeAsSeconds(String) - Method in class org.apache.spark.SparkConf
-
Get a time parameter as seconds; throws a NoSuchElementException if it's not set.
- getTimeAsSeconds(String, String) - Method in class org.apache.spark.SparkConf
-
Get a time parameter as seconds, falling back to a default if not set.
- getTimeMillis() - Method in interface org.apache.spark.util.Clock
-
- getTimer(L) - Method in interface org.apache.spark.util.ListenerBus
-
Returns a CodaHale metrics Timer for measuring the listener's event processing time.
- getTimestamp(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of timestamp type as java.sql.Timestamp.
- getTimestamp() - Method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions.AtTimestamp
-
- getTimestampWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getTimestampWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- getTimeZoneOffset() - Static method in class org.apache.spark.ui.UIUtils
-
- GETTING_RESULT_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
-
- gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
The time when the task started remotely getting the result.
- gettingResultTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- gettingResultTime(TaskData) - Static method in class org.apache.spark.status.AppStatusUtils
-
- gettingResultTime(long, long, long) - Static method in class org.apache.spark.status.AppStatusUtils
-
- getTol() - Method in interface org.apache.spark.ml.param.shared.HasTol
-
- getToLowercase() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- getTopicConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getTopicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
distributions over terms.
- getTopicDistributionCol() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
- getTopologyForHost(String) - Method in class org.apache.spark.storage.DefaultTopologyMapper
-
- getTopologyForHost(String) - Method in class org.apache.spark.storage.FileBasedTopologyMapper
-
- getTopologyForHost(String) - Method in class org.apache.spark.storage.TopologyMapper
-
Gets the topology information given the host name
- getTrainRatio() - Method in interface org.apache.spark.ml.tuning.TrainValidationSplitParams
-
- getTreeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getTruncateQuery(String, Option<Object>) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
The SQL query used to truncate a table.
- getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- getTruncateQuery(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
The SQL query that should be used to truncate a table.
- getTruncateQuery(String, Option<Object>) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
The SQL query that should be used to truncate a table.
- getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
The SQL query used to truncate a table.
- getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
The SQL query used to truncate a table.
- getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
The SQL query used to truncate a table.
- getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- getUDTFor(String) - Static method in class org.apache.spark.sql.types.UDTRegistration
-
Returns the Class of UserDefinedType for the name of a given user class.
- getUidMap(Params) - Static method in class org.apache.spark.ml.util.MetaAlgorithmReadWrite
-
Examine the given estimator (which may be a compound estimator) and extract a mapping
from UIDs to corresponding Params instances.
- getUiRoot(ServletContext) - Static method in class org.apache.spark.status.api.v1.UIRootFromServletContext
-
- getUpperBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds
-
Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p,
it is very unlikely to have fewer than fraction * n successes.
- getUpperBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds
-
Returns a lambda such that Pr[X < s] is very small, where X ~ Pois(lambda).
- getUpperBoundsOnCoefficients() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
- getUpperBoundsOnIntercepts() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
- getUsedTimeMs(long) - Static method in class org.apache.spark.util.Utils
-
Return a string describing how much time has passed, in milliseconds.
- getUseNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getUserCol() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
-
- getUserJars(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Return the jar files pointed to by the "spark.jars" property.
- getUTF8String(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- getUTF8String(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- getUTF8String(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- getUTF8String(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns the string type value for rowId.
- getValidationIndicatorCol() - Method in interface org.apache.spark.ml.param.shared.HasValidationIndicatorCol
-
- getValidationTol() - Method in interface org.apache.spark.ml.tree.GBTParams
-
- getValidationTol() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getValue(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
Gets a value given its index.
- getValuesMap(Seq<String>) - Method in interface org.apache.spark.sql.Row
-
Returns a Map consisting of names and values for the requested fieldNames.
For primitive types, if the value is null, it returns the 'zero value' specific to that primitive,
i.e.
- getVarianceCol() - Method in interface org.apache.spark.ml.param.shared.HasVarianceCol
-
- getVariancePower() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
- getVectors() - Method in class org.apache.spark.ml.feature.Word2VecModel
-
Returns a dataframe with two fields, "word" and "vector", with "word" being a String
and "vector" the DenseVector that it is mapped to.
- getVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Returns a map of words to their vector representations.
- getVectorSize() - Method in interface org.apache.spark.ml.feature.Word2VecBase
-
- getVocabSize() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
-
- getWeightCol() - Method in interface org.apache.spark.ml.param.shared.HasWeightCol
-
- getWidth(Row) - Static method in class org.apache.spark.ml.image.ImageSchema
-
Gets the width of the image
- getWindowSize() - Method in interface org.apache.spark.ml.feature.Word2VecBase
-
- getWithMean() - Method in interface org.apache.spark.ml.feature.StandardScalerParams
-
- getWithStd() - Method in interface org.apache.spark.ml.feature.StandardScalerParams
-
- Gini - Class in org.apache.spark.mllib.tree.impurity
-
Class for calculating the Gini impurity
(http://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity)
during multiclass classification.
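A minimal sketch of the Gini impurity calculation named in this entry, assuming per-class counts as input and the standard definition 1 - sum of squared class proportions. The class name `GiniSketch` and the method `impurity` are hypothetical and are not part of the Spark API; this is plain Java, not Spark's implementation.

```java
// Hypothetical helper, not the Spark class: illustrates the Gini impurity formula only.
final class GiniSketch {
    /** Returns 1 - sum_k p_k^2 for the given per-class counts; 0 for an empty node. */
    static double impurity(double[] classCounts) {
        double total = 0.0;
        for (double count : classCounts) {
            total += count;
        }
        if (total == 0.0) {
            return 0.0; // assumption: an empty node is treated as pure
        }
        double sumOfSquares = 0.0;
        for (double count : classCounts) {
            double p = count / total; // proportion of this class
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }

    public static void main(String[] args) {
        // Two equally frequent classes give the maximum two-class impurity.
        System.out.println(impurity(new double[] {5, 5})); // prints 0.5
    }
}
```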
- Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
-
- GLMClassificationModel - Class in org.apache.spark.mllib.classification.impl
-
Helper class for import/export of GLM classification models.
- GLMClassificationModel() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel
-
- GLMClassificationModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.classification.impl
-
- GLMClassificationModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.classification.impl
-
Model data for import/export
- GLMClassificationModel.SaveLoadV1_0$.Data$ - Class in org.apache.spark.mllib.classification.impl
-
- GLMRegressionModel - Class in org.apache.spark.mllib.regression.impl
-
Helper methods for import/export of GLM regression models.
- GLMRegressionModel() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel
-
- GLMRegressionModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.regression.impl
-
- GLMRegressionModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.regression.impl
-
Model data for model import/export
- GLMRegressionModel.SaveLoadV1_0$.Data$ - Class in org.apache.spark.mllib.regression.impl
-
- glom() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by coalescing all elements within each partition into an array.
- glom() - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by coalescing all elements within each partition into an array.
- glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
this DStream.
- glom() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
this DStream.
- goButtonFormPath() - Method in interface org.apache.spark.ui.PagedTable
-
Returns the submission path for the "go to page #" form.
- goodnessOfFit() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
- grad(DenseMatrix<Object>, DenseMatrix<Object>, DenseVector<Object>) - Method in interface org.apache.spark.ml.ann.LayerModel
-
Computes the gradient.
- grad() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- gradient() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
-
The current weighted averaged gradient.
- gradient() - Method in class org.apache.spark.ml.regression.AFTAggregator
-
- Gradient - Class in org.apache.spark.mllib.optimization
-
Developer API
Class used to compute the gradient for a loss function, given a single data point.
- Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
-
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
-
Method to calculate the gradients for the gradient boosting calculation for least
absolute error calculation.
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
-
Method to calculate the loss gradients for the gradient boosting calculation for binary
classification.
The gradient with respect to F(x) is: - 4 y / (1 + exp(2 y F(x)))
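The formula in this entry can be evaluated directly. The sketch below is plain Java rather than the Spark class; it assumes that the first argument is the prediction F(x), the second is the label y, and that labels take values in {-1, +1}. The class name `LogLossGradientSketch` is hypothetical.

```java
// Hypothetical helper: evaluates -4 y / (1 + exp(2 y F(x))) as stated in the entry above.
final class LogLossGradientSketch {
    /**
     * @param prediction F(x), the current raw model output (assumed argument order)
     * @param label      y, assumed to be -1 or +1
     */
    static double gradient(double prediction, double label) {
        return -4.0 * label / (1.0 + Math.exp(2.0 * label * prediction));
    }

    public static void main(String[] args) {
        // With F(x) = 0 the denominator is 2, so the gradient is -2 y.
        System.out.println(gradient(0.0, 1.0)); // prints -2.0
    }
}
```

Under these assumptions the gradient shrinks toward zero as y F(x) grows, that is, as the prediction agrees more strongly with the label.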
- gradient(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate the gradients for the gradient boosting calculation.
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
-
Method to calculate the gradients for the gradient boosting calculation for least
squares error calculation.
- GradientBoostedTrees - Class in org.apache.spark.ml.tree.impl
-
- GradientBoostedTrees() - Constructor for class org.apache.spark.ml.tree.impl.GradientBoostedTrees
-
- GradientBoostedTrees - Class in org.apache.spark.mllib.tree
-
- GradientBoostedTrees(BoostingStrategy) - Constructor for class org.apache.spark.mllib.tree.GradientBoostedTrees
-
- GradientBoostedTreesModel - Class in org.apache.spark.mllib.tree.model
-
Represents a gradient boosted trees model.
- GradientBoostedTreesModel(Enumeration.Value, DecisionTreeModel[], double[]) - Constructor for class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- GradientDescent - Class in org.apache.spark.mllib.optimization
-
Class used to solve an optimization problem using Gradient Descent.
- gradientSumArray() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
-
Array of gradient values that are mutated when new instances are added to the aggregator.
- Graph<VD,ED> - Class in org.apache.spark.graphx
-
The Graph abstractly represents a graph with arbitrary objects
associated with vertices and edges.
- GraphGenerators - Class in org.apache.spark.graphx.util
-
A collection of graph generating functions.
- GraphGenerators() - Constructor for class org.apache.spark.graphx.util.GraphGenerators
-
- GraphImpl<VD,ED> - Class in org.apache.spark.graphx.impl
-
An implementation of Graph to support computation on graphs.
- GraphLoader - Class in org.apache.spark.graphx
-
Provides utilities for loading Graphs from files.
- GraphLoader() - Constructor for class org.apache.spark.graphx.GraphLoader
-
- GraphOps<VD,ED> - Class in org.apache.spark.graphx
-
Contains additional functionality for Graph.
- GraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.GraphOps
-
- graphToGraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Implicitly extracts the GraphOps member from a graph.
- GraphXUtils - Class in org.apache.spark.graphx
-
- GraphXUtils() - Constructor for class org.apache.spark.graphx.GraphXUtils
-
- greater(Duration) - Method in class org.apache.spark.streaming.Duration
-
- greater(Time) - Method in class org.apache.spark.streaming.Time
-
- greaterEq(Duration) - Method in class org.apache.spark.streaming.Duration
-
- greaterEq(Time) - Method in class org.apache.spark.streaming.Time
-
- GreaterThan - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a value greater than value.
- GreaterThan(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThan
-
- GreaterThanOrEqual - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a value greater than or equal to value.
- GreaterThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- greatest(Column...) - Static method in class org.apache.spark.sql.functions
-
Returns the greatest value of the list of values, skipping null values.
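To illustrate the null-skipping behavior described here, the following plain-Java sketch applies the same rule to an in-memory list. It does not call Spark; the class `GreatestSketch` is hypothetical, and returning null when every input is null is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: mimics "greatest value, skipping nulls" on a plain Java list.
final class GreatestSketch {
    static <T extends Comparable<T>> T greatest(List<T> values) {
        T best = null;
        for (T value : values) {
            if (value == null) {
                continue; // null entries are skipped
            }
            if (best == null || value.compareTo(best) > 0) {
                best = value;
            }
        }
        return best; // assumption: null when all inputs are null
    }

    public static void main(String[] args) {
        System.out.println(greatest(Arrays.asList(3, null, 7, 5))); // prints 7
    }
}
```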
- greatest(String, String...) - Static method in class org.apache.spark.sql.functions
-
Returns the greatest value of the list of column names, skipping null values.
- greatest(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Returns the greatest value of the list of values, skipping null values.
- greatest(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Returns the greatest value of the list of column names, skipping null values.
- gridGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Create a rows by cols grid graph with each vertex connected to its row+1 and col+1 neighbors.
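To make the grid connectivity concrete, here is a minimal plain-Java sketch that lists the edges of such a grid. It does not use GraphX; the class GridGraphSketch is hypothetical, and numbering vertex (r, c) as r * cols + c is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; GridGraphSketch is a hypothetical name, not a GraphX class.
class GridGraphSketch {
    // Builds the directed edge list {source, destination} of a rows-by-cols grid.
    // Assumption: vertex (r, c) is numbered r * cols + c.
    static List<long[]> gridEdges(int rows, int cols) {
        List<long[]> edges = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                long id = (long) r * cols + c;
                if (r + 1 < rows) {
                    edges.add(new long[] {id, id + cols}); // row+1 neighbor
                }
                if (c + 1 < cols) {
                    edges.add(new long[] {id, id + 1});    // col+1 neighbor
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        for (long[] edge : gridEdges(2, 3)) {
            System.out.println(edge[0] + " -> " + edge[1]);
        }
    }
}
```

Under these assumptions a grid has rows * (cols - 1) + (rows - 1) * cols edges, so the 2 by 3 example yields 7.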
- groupArr() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- groupBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD of grouped elements.
- groupBy(Function<T, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD of grouped elements.
- groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped items.
- groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped elements.
- groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped items.
- groupBy(Column...) - Method in class org.apache.spark.sql.Dataset
-
Groups the Dataset using the specified columns, so we can run aggregation on them.
- groupBy(String, String...) - Method in class org.apache.spark.sql.Dataset
-
Groups the Dataset using the specified columns, so that we can run aggregation on them.
- groupBy(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Groups the Dataset using the specified columns, so we can run aggregation on them.
- groupBy(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Groups the Dataset using the specified columns, so that we can run aggregation on them.
- groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
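The grouping that groupByKey performs can be illustrated on an in-memory list of pairs. The sketch below is plain Java with a hypothetical class GroupByKeySketch; it ignores partitioning and shuffling entirely, and it preserves insertion order only because it uses a LinkedHashMap (Spark itself does not promise any ordering within a group).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; GroupByKeySketch is a hypothetical name, not a Spark class.
class GroupByKeySketch {
    // Collects the values of each key into a single list.
    static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> pairs) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Small fixed input used by main.
    static Map<String, List<Integer>> demo() {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        return groupByKey(pairs);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {a=[1, 3], b=[2]}
    }
}
```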
- groupByKey(Function1<T, K>, Encoder<K>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Scala-specific)
Returns a KeyValueGroupedDataset where the data is grouped by the given key func.
- groupByKey(MapFunction<T, K>, Encoder<K>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Java-specific)
Returns a KeyValueGroupedDataset where the data is grouped by the given key func.
- groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey on each RDD of this DStream.
- groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey on each RDD.
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Create a new DStream by applying groupByKey over a sliding window on this DStream.
- GroupByType$() - Constructor for class org.apache.spark.sql.RelationalGroupedDataset.GroupByType$
-
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.Graph
-
Merges multiple edges between two vertices into a single edge.
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- groupHash() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- grouping(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated
or not, returns 1 for aggregated or 0 for not aggregated in the result set.
- grouping(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated
or not, returns 1 for aggregated or 0 for not aggregated in the result set.
- grouping_id(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the level of grouping, equals to
- grouping_id(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the level of grouping, equals to
- GroupMappingServiceProvider - Interface in org.apache.spark.security
-
This Spark trait is used for mapping a given userName to a set of groups which it belongs to.
- GroupState<S> - Interface in org.apache.spark.sql.streaming
-
Experimental
- GroupStateTimeout - Class in org.apache.spark.sql.streaming
-
Represents the type of timeouts possible for the Dataset operations
`mapGroupsWithState` and `flatMapGroupsWithState`.
- GroupStateTimeout() - Constructor for class org.apache.spark.sql.streaming.GroupStateTimeout
-
- groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- gt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check if value is greater than lowerBound
- gt(Object) - Method in class org.apache.spark.sql.Column
-
Greater than.
- gtEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check if value is greater than or equal to lowerBound
- guard(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-
- id() - Method in class org.apache.spark.Accumulable
-
Deprecated.
- id() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
A unique ID for this RDD (within its SparkContext).
- id() - Method in class org.apache.spark.broadcast.Broadcast
-
- id() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
-
- id() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-
- id() - Method in class org.apache.spark.mllib.tree.model.Node
-
- id() - Method in class org.apache.spark.rdd.RDD
-
A unique ID for this RDD (within its SparkContext).
- id() - Method in class org.apache.spark.scheduler.AccumulableInfo
-
- id() - Method in class org.apache.spark.scheduler.TaskInfo
-
- id() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Returns the unique id of this query that persists across restarts from checkpoint data.
- id() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryStartedEvent
-
- id() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryTerminatedEvent
-
- id() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
- id() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
-
- id() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- id() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- id() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- id() - Method in class org.apache.spark.storage.RDDInfo
-
- id() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
This is a unique identifier for the input stream.
- id() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-
- id() - Method in class org.apache.spark.util.AccumulatorV2
-
Returns the id of this accumulator, can only be called after registration.
- Identifiable - Interface in org.apache.spark.ml.util
-
Developer API
- Identity$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
-
- IDF - Class in org.apache.spark.ml.feature
-
Compute the Inverse Document Frequency (IDF) given a collection of documents.
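As a rough illustration of the quantity being computed, the sketch below evaluates a smoothed inverse document frequency for a single term. It is plain Java, not the Spark estimator; the class IdfSketch is hypothetical, and the formula log((m + 1) / (df + 1)), with m the number of documents and df the number of documents containing the term, is stated here as an assumption about the smoothing in use.

```java
// Plain-Java illustration only; IdfSketch is a hypothetical name, not a Spark class.
class IdfSketch {
    // Smoothed IDF for one term.
    // Assumption: idf = log((numDocs + 1) / (docFreq + 1)).
    static double idf(long numDocs, long docFreq) {
        return Math.log((numDocs + 1.0) / (docFreq + 1.0));
    }

    public static void main(String[] args) {
        // A term present in every document carries no weight.
        System.out.println(idf(9, 9)); // prints 0.0
        // A term present in a single document is weighted more heavily.
        System.out.println(idf(9, 1));
    }
}
```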
- IDF(String) - Constructor for class org.apache.spark.ml.feature.IDF
-
- IDF() - Constructor for class org.apache.spark.ml.feature.IDF
-
- idf() - Method in class org.apache.spark.ml.feature.IDFModel
-
Returns the IDF vector.
- IDF - Class in org.apache.spark.mllib.feature
-
Inverse document frequency (IDF).
- IDF(int) - Constructor for class org.apache.spark.mllib.feature.IDF
-
- IDF() - Constructor for class org.apache.spark.mllib.feature.IDF
-
- idf() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Returns the current IDF vector.
- idf() - Method in class org.apache.spark.mllib.feature.IDFModel
-
- IDF.DocumentFrequencyAggregator - Class in org.apache.spark.mllib.feature
-
Document frequency aggregator.
- IDFBase - Interface in org.apache.spark.ml.feature
-
- IDFModel - Class in org.apache.spark.ml.feature
-
- IDFModel - Class in org.apache.spark.mllib.feature
-
Represents an IDF model that can transform term frequency vectors.
- ifPartitionNotExists() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- ImageDataSource - Class in org.apache.spark.ml.source.image
-
The image package implements the Spark SQL data source API for loading image data as a DataFrame.
- ImageDataSource() - Constructor for class org.apache.spark.ml.source.image.ImageDataSource
-
- imageFields() - Static method in class org.apache.spark.ml.image.ImageSchema
-
- ImageSchema - Class in org.apache.spark.ml.image
-
Experimental
Defines the image schema and methods to read and manipulate images.
- ImageSchema() - Constructor for class org.apache.spark.ml.image.ImageSchema
-
- imageSchema() - Static method in class org.apache.spark.ml.image.ImageSchema
-
DataFrame with a single column of images named "image" (nullable)
- implicitPrefs() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param to decide whether to use implicit preference.
- implicits() - Method in class org.apache.spark.sql.SparkSession
-
Accessor for nested Scala object
- implicits() - Method in class org.apache.spark.sql.SQLContext
-
Accessor for nested Scala object
- implicits$() - Constructor for class org.apache.spark.sql.SparkSession.implicits$
-
- implicits$() - Constructor for class org.apache.spark.sql.SQLContext.implicits$
-
- improveException(Object, NotSerializableException) - Static method in class org.apache.spark.serializer.SerializationDebugger
-
Improve the given NotSerializableException with the serialization path leading from the given
object to the problematic object.
- Impurities - Class in org.apache.spark.mllib.tree.impurity
-
Factory for Impurity instances.
- Impurities() - Constructor for class org.apache.spark.mllib.tree.impurity.Impurities
-
- impurity() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
-
- impurity() - Method in class org.apache.spark.ml.tree.InternalNode
-
- impurity() - Method in class org.apache.spark.ml.tree.LeafNode
-
- impurity() - Method in class org.apache.spark.ml.tree.Node
-
Impurity measure at this node (for training data)
- impurity() - Method in interface org.apache.spark.ml.tree.TreeClassifierParams
-
Criterion used for information gain calculation (case-insensitive).
- impurity() - Method in interface org.apache.spark.ml.tree.TreeRegressorParams
-
Criterion used for information gain calculation (case-insensitive).
- impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- Impurity - Interface in org.apache.spark.mllib.tree.impurity
-
Trait for calculating information gain.
- impurity() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- impurity() - Method in class org.apache.spark.mllib.tree.model.Node
-
- impurityStats() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
-
- Imputer - Class in org.apache.spark.ml.feature
-
Experimental
Imputation estimator for completing missing values, either using the mean or the median
of the columns in which the missing values are located.
- Imputer(String) - Constructor for class org.apache.spark.ml.feature.Imputer
-
- Imputer() - Constructor for class org.apache.spark.ml.feature.Imputer
-
- ImputerModel - Class in org.apache.spark.ml.feature
-
Experimental
Model fitted by Imputer.
- ImputerParams - Interface in org.apache.spark.ml.feature
-
- In() - Static method in class org.apache.spark.graphx.EdgeDirection
-
Edges arriving at a vertex.
- In - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to one of the values in the array.
- In(String, Object[]) - Constructor for class org.apache.spark.sql.sources.In
-
- INACTIVE() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
-
- inArray(Object) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check for value in an allowed set of values.
- inArray(List<T>) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check for value in an allowed set of values.
- InBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.InBlock$
-
- InboxMessage - Interface in org.apache.spark.rpc.netty
-
- IncompatibleMergeException - Exception in org.apache.spark.util.sketch
-
- IncompatibleMergeException(String) - Constructor for exception org.apache.spark.util.sketch.IncompatibleMergeException
-
- incrementFetchedPartitions(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
- incrementFileCacheHits(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
- incrementFilesDiscovered(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
- incrementHiveClientCalls(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
- incrementParallelListingJobCount(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
- inDegrees() - Method in class org.apache.spark.graphx.GraphOps
-
The in-degree of each vertex in the graph.
- independence() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
- INDETERMINATE() - Static method in class org.apache.spark.rdd.DeterministicLevel
-
- index() - Method in class org.apache.spark.ml.attribute.Attribute
-
Index of the attribute.
- INDEX() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
-
- index() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
-
- index() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
- index() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- index() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- index(int, int) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Return the index for the (i, j)-th element in the backing array.
- index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
-
- index(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Return the index for the (i, j)-th element in the backing array.
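The mapping from (i, j) to a position in the backing array can be sketched as follows. The example assumes a dense, non-transposed matrix stored in column-major order; sparse or transposed layouts compute the offset differently. The class ColumnMajorIndexSketch is hypothetical and independent of the Spark Matrix interface.

```java
// Plain-Java illustration only; ColumnMajorIndexSketch is a hypothetical name.
class ColumnMajorIndexSketch {
    // Offset of element (i, j) in a column-major backing array with numRows rows.
    // Assumption: dense, non-transposed storage.
    static int index(int i, int j, int numRows) {
        return i + numRows * j;
    }

    public static void main(String[] args) {
        // The 2 x 3 matrix [[1, 3, 5], [2, 4, 6]] stored column by column.
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        System.out.println(values[index(1, 2, 2)]); // prints 6.0
    }
}
```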
- index() - Method in interface org.apache.spark.Partition
-
Get the partition's index within its parent RDD
- index() - Method in class org.apache.spark.scheduler.TaskInfo
-
The index of this task within its task set.
- index() - Method in class org.apache.spark.status.api.v1.TaskData
-
- IndexedRow - Class in org.apache.spark.mllib.linalg.distributed
-
- IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
-
- IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
- IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- indexName(String) - Static method in class org.apache.spark.ui.jobs.ApiHelper
-
- indexOf(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Index of an attribute specified by name.
- indexOf(String) - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
Index of a specific value.
- indexOf(Object) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Returns the index of the input term.
- indexToLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the level of the tree that the given node is in.
- IndexToString - Class in org.apache.spark.ml.feature
-
A Transformer that maps a column of indices back to a new column of corresponding string values.
- IndexToString(String) - Constructor for class org.apache.spark.ml.feature.IndexToString
-
- IndexToString() - Constructor for class org.apache.spark.ml.feature.IndexToString
-
- indices() - Method in class org.apache.spark.ml.feature.VectorSlicer
-
An array of indices to select features from a vector column.
- indices() - Method in class org.apache.spark.ml.linalg.SparseVector
-
- indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- IndylambdaScalaClosures - Class in org.apache.spark.util
-
- IndylambdaScalaClosures() - Constructor for class org.apache.spark.util.IndylambdaScalaClosures
-
- inferSchema(SparkSession, Map<String, String>, Seq<FileStatus>) - Method in class org.apache.spark.sql.hive.execution.HiveFileFormat
-
- inferSchema(CatalogTable) - Static method in class org.apache.spark.sql.hive.HiveUtils
-
Infers the schema for Hive serde tables and returns the CatalogTable with the inferred schema.
- inferSchema(SparkSession, Map<String, String>, Seq<FileStatus>) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- info() - Method in class org.apache.spark.status.LiveRDD
-
- info() - Method in class org.apache.spark.status.LiveStage
-
- info() - Method in class org.apache.spark.status.LiveTask
-
- infoChanged(SparkAppHandle) - Method in interface org.apache.spark.launcher.SparkAppHandle.Listener
-
Callback for changes in any information that is not the handle's state.
- infoGain() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- InformationGainStats - Class in org.apache.spark.mllib.tree.model
-
Developer API
Information gain statistics for each split
param: gain information gain value
param: impurity current node impurity
param: leftImpurity left node impurity
param: rightImpurity right node impurity
param: leftPredict left node predict
param: rightPredict right node predict
- InformationGainStats(double, double, double, double, Predict, Predict) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
-
- init() - Method in interface org.apache.spark.ExecutorPlugin
-
Initialize the executor plugin.
- initcap(Column) - Static method in class org.apache.spark.sql.functions
-
Returns a new string column by converting the first letter of each word to uppercase.
- initDaemon(Logger) - Static method in class org.apache.spark.util.Utils
-
Utility function that should be called early in main() for daemons to set up some common diagnostic state.
- initHadoopOutputMetrics(TaskContext) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
-
- initialHash() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- initialize(double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
-
- initialize(double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
-
- initialize(double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
-
- initialize(double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
-
- initialize(RDD<Tuple2<Object, Vector>>, LDA) - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer
-
Initializer for the optimizer.
- initialize() - Static method in class org.apache.spark.rdd.InputFileBlockHolder
-
Initializes thread local by explicitly getting the value.
- initialize(TaskScheduler, SchedulerBackend) - Method in interface org.apache.spark.scheduler.ExternalClusterManager
-
Initialize task scheduler and backend scheduler.
- initialize(MutableAggregationBuffer) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
Initializes the given aggregation buffer, i.e.
- Initialized() - Static method in class org.apache.spark.rdd.CheckpointState
-
- initializeLogging(boolean, boolean) - Method in interface org.apache.spark.internal.Logging
-
- initializeLogIfNecessary(boolean) - Method in interface org.apache.spark.internal.Logging
-
- initializeLogIfNecessary(boolean, boolean) - Method in interface org.apache.spark.internal.Logging
-
- initialState(RDD<Tuple2<KeyType, StateType>>) - Method in class org.apache.spark.streaming.StateSpec
-
Set the RDD containing the initial states that will be used by mapWithState
- initialState(JavaPairRDD<KeyType, StateType>) - Method in class org.apache.spark.streaming.StateSpec
-
Set the RDD containing the initial states that will be used by mapWithState
- initialValue() - Method in class org.apache.spark.partial.PartialResult
-
- initialWeights() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
-
The initial weights of the model.
- initInputSerDe(Seq<Expression>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- initMode() - Method in interface org.apache.spark.ml.clustering.KMeansParams
-
Param for the initialization algorithm.
- initMode() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
-
Param for the initialization algorithm.
- initModel(DenseVector<Object>, Random) - Method in interface org.apache.spark.ml.ann.Layer
-
Returns the instance of the layer with randomly generated weights.
- initOutputFormat(JobContext) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
-
- initOutputSerDe(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- initSteps() - Method in interface org.apache.spark.ml.clustering.KMeansParams
-
Param for the number of steps for the k-means|| initialization mode.
- initWriter(TaskAttemptContext, int) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
-
- injectCheckRule(Function1<SparkSession, Function1<LogicalPlan, BoxedUnit>>) - Method in class org.apache.spark.sql.SparkSessionExtensions
-
Inject a check analysis Rule builder into the SparkSession.
- injectOptimizerRule(Function1<SparkSession, Rule<LogicalPlan>>) - Method in class org.apache.spark.sql.SparkSessionExtensions
-
- injectParser(Function2<SparkSession, ParserInterface, ParserInterface>) - Method in class org.apache.spark.sql.SparkSessionExtensions
-
- injectPlannerStrategy(Function1<SparkSession, SparkStrategy>) - Method in class org.apache.spark.sql.SparkSessionExtensions
-
- injectPostHocResolutionRule(Function1<SparkSession, Rule<LogicalPlan>>) - Method in class org.apache.spark.sql.SparkSessionExtensions
-
- injectResolutionRule(Function1<SparkSession, Rule<LogicalPlan>>) - Method in class org.apache.spark.sql.SparkSessionExtensions
-
Inject an analyzer resolution Rule builder into the SparkSession.
- InnerClosureFinder - Class in org.apache.spark.util
-
- InnerClosureFinder(Set<Class<?>>) - Constructor for class org.apache.spark.util.InnerClosureFinder
-
- innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.EdgeRDD
-
Inner joins this EdgeRDD with another EdgeRDD, assuming both are partitioned using the same PartitionStrategy.
- innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Inner joins this VertexRDD with an RDD containing vertex attribute pairs.
- innerZipJoin(VertexRDD<U>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- innerZipJoin(VertexRDD<U>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index.
- inPlace() - Method in interface org.apache.spark.ml.ann.Layer
-
If true, the memory is not allocated for the output of this layer.
- InProcessLauncher - Class in org.apache.spark.launcher
-
In-process launcher for Spark applications.
- InProcessLauncher() - Constructor for class org.apache.spark.launcher.InProcessLauncher
-
- input() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
-
- INPUT() - Static method in class org.apache.spark.ui.ToolTips
-
- input$() - Constructor for class org.apache.spark.InternalAccumulator.input$
-
- input_file_name() - Static method in class org.apache.spark.sql.functions
-
Creates a string column for the file name of the current Spark task.
- INPUT_FORMAT() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- INPUT_METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
-
- INPUT_RECORDS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- INPUT_SIZE() - Static method in class org.apache.spark.status.TaskIndexNames
-
- inputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- inputBytes() - Method in class org.apache.spark.status.api.v1.StageData
-
- inputCol() - Method in interface org.apache.spark.ml.param.shared.HasInputCol
-
Param for input column name.
- inputCols() - Method in interface org.apache.spark.ml.param.shared.HasInputCols
-
Param for input column names.
- inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
-
- InputDStream<T> - Class in org.apache.spark.streaming.dstream
-
This is the abstract base class for all input streams.
- InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
-
- InputFileBlockHolder - Class in org.apache.spark.rdd
-
This holds file names of the current Spark task.
- InputFileBlockHolder() - Constructor for class org.apache.spark.rdd.InputFileBlockHolder
-
- inputFiles() - Method in class org.apache.spark.sql.Dataset
-
Returns a best-effort snapshot of the files that compose this Dataset.
- inputFormat() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
-
- InputFormatInfo - Class in org.apache.spark.scheduler
-
Developer API
Parses and holds information about inputFormat (and files) specified as a parameter.
- InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
-
- InputMetricDistributions - Class in org.apache.spark.status.api.v1
-
- InputMetrics - Class in org.apache.spark.status.api.v1
-
- inputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- inputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- InputPartition<T> - Interface in org.apache.spark.sql.sources.v2.reader
-
- InputPartitionReader<T> - Interface in org.apache.spark.sql.sources.v2.reader
-
- inputRecords() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- inputRecords() - Method in class org.apache.spark.status.api.v1.StageData
-
- inputRowFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- inputRowFormatMap() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- inputRowsPerSecond() - Method in class org.apache.spark.sql.streaming.SourceProgress
-
- inputRowsPerSecond() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
The aggregate (across all sources) rate of data arriving.
- inputSchema() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
A StructType representing the data types of the input arguments of this aggregate function.
- inputSerdeClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- inputSerdeProps() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- inputSize() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
-
- inputStreamId() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
-
- inputTypes() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
- inRange(double, double, boolean, boolean) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check for value in range lowerBound to upperBound.
- inRange(double, double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Version of `inRange()` which uses inclusive bounds by default: [lowerBound, upperBound]
- insert(Dataset<Row>, boolean) - Method in interface org.apache.spark.sql.sources.InsertableRelation
-
- InsertableRelation - Interface in org.apache.spark.sql.sources
-
A BaseRelation that can be used to insert data into it through the insert method.
- insertInto(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Inserts the content of the DataFrame to the specified table.
- InsertIntoHiveDirCommand - Class in org.apache.spark.sql.hive.execution
-
Command for writing the results of query to file system.
- InsertIntoHiveDirCommand(boolean, CatalogStorageFormat, LogicalPlan, boolean, Seq<String>) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
-
- InsertIntoHiveTable - Class in org.apache.spark.sql.hive.execution
-
Command for writing data out to a Hive table.
- InsertIntoHiveTable(CatalogTable, Map<String, Option<String>>, LogicalPlan, boolean, boolean, Seq<String>) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- inShutdown() - Static method in class org.apache.spark.util.ShutdownHookManager
-
Detect whether this thread might be executing a shutdown hook.
- inspect(Object) - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
- inspectorToDataType(ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- inspectorToDataType(ObjectInspector) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- instance() - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
Get this impurity instance.
- instance() - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
Get this impurity instance.
- instance() - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
Get this impurity instance.
- INSTANCE - Static variable in class org.apache.spark.serializer.DummySerializerInstance
-
- instantiate(String, String, String, boolean) - Static method in class org.apache.spark.internal.io.FileCommitProtocol
-
Instantiates a FileCommitProtocol using the given className.
- instr(Column, String) - Static method in class org.apache.spark.sql.functions
-
Locate the position of the first occurrence of substr column in the given string.
- INT() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable int type.
- intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- intAccumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
- IntAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
Deprecated.
- IntArrayParam - Class in org.apache.spark.ml.param
-
Developer API
Specialized version of Param[Array[Int]] for Java.
- IntArrayParam(Params, String, String, Function1<int[], Object>) - Constructor for class org.apache.spark.ml.param.IntArrayParam
-
- IntArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.IntArrayParam
-
- IntegerType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the IntegerType object.
- IntegerType - Class in org.apache.spark.sql.types
-
The data type representing Int values.
- IntegerType() - Constructor for class org.apache.spark.sql.types.IntegerType
-
- INTER_JOB_WAIT_MS() - Static method in class org.apache.spark.ui.UIWorkloadGenerator
-
- InteractableTerm - Interface in org.apache.spark.ml.feature
-
A term that may be part of an interaction, e.g.
- Interaction - Class in org.apache.spark.ml.feature
-
Implements the feature interaction transform.
- Interaction(String) - Constructor for class org.apache.spark.ml.feature.Interaction
-
- Interaction() - Constructor for class org.apache.spark.ml.feature.Interaction
-
- intercept() - Method in class org.apache.spark.ml.classification.LinearSVCModel
-
- intercept() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
The model intercept for "binomial" logistic regression.
- intercept() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- intercept() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
- intercept() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- intercept() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
-
- intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
-
- intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
- intercept() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
-
- intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
-
- intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
-
- intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- interceptVector() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- intermediateStorageLevel() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for StorageLevel for intermediate datasets.
- InternalAccumulator - Class in org.apache.spark
-
A collection of fields and methods concerned with internal accumulators that represent
task level metrics.
- InternalAccumulator() - Constructor for class org.apache.spark.InternalAccumulator
-
- InternalAccumulator.input$ - Class in org.apache.spark
-
- InternalAccumulator.output$ - Class in org.apache.spark
-
- InternalAccumulator.shuffleRead$ - Class in org.apache.spark
-
- InternalAccumulator.shuffleWrite$ - Class in org.apache.spark
-
- InternalKMeansModelWriter - Class in org.apache.spark.ml.clustering
-
A writer for KMeans that handles the "internal" (or default) format
- InternalKMeansModelWriter() - Constructor for class org.apache.spark.ml.clustering.InternalKMeansModelWriter
-
- InternalLinearRegressionModelWriter - Class in org.apache.spark.ml.regression
-
A writer for LinearRegression that handles the "internal" (or default) format
- InternalLinearRegressionModelWriter() - Constructor for class org.apache.spark.ml.regression.InternalLinearRegressionModelWriter
-
- InternalNode - Class in org.apache.spark.ml.tree
-
Internal Decision Tree node.
- InterruptibleIterator<T> - Class in org.apache.spark
-
Developer API
An iterator that wraps around an existing iterator to provide task killing functionality.
- InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
-
- interruptThread() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
-
- interruptThread() - Method in class org.apache.spark.scheduler.local.KillTask
-
- intersect(Dataset<T>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset containing rows only in both this Dataset and another Dataset.
- intersectAll(Dataset<T>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset containing rows only in both this Dataset and another Dataset while
preserving the duplicates.
- intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the intersection of this RDD and another one.
- intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the intersection of this RDD and another one.
- intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return the intersection of this RDD and another one.
- intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the intersection of this RDD and another one.
- intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the intersection of this RDD and another one.
- intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
-
Return the intersection of this RDD and another one.
- intervalMs() - Method in class org.apache.spark.sql.streaming.ProcessingTime
-
Deprecated.
- IntParam - Class in org.apache.spark.ml.param
-
Developer API
Specialized version of Param[Int] for Java.
- IntParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.IntParam
-
- IntParam(String, String, String) - Constructor for class org.apache.spark.ml.param.IntParam
-
- IntParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.IntParam
-
- IntParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.IntParam
-
- IntParam - Class in org.apache.spark.util
-
An extractor object for parsing strings into integers.
- IntParam() - Constructor for class org.apache.spark.util.IntParam
-
- invalidateSerializedMapOutputStatusCache() - Method in class org.apache.spark.ShuffleStatus
-
Clears the cached serialized map output statuses.
- inverse() - Method in class org.apache.spark.ml.feature.DCT
-
Indicates whether to perform the inverse DCT (true) or forward DCT (false).
- inverse(double[], int) - Static method in class org.apache.spark.mllib.linalg.CholeskyDecomposition
-
Computes the inverse of a real symmetric positive definite matrix A
using the Cholesky factorization A = U**T*U.
- Inverse$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
-
- invokedMethod(Object, Class<?>, String) - Static method in class org.apache.spark.graphx.util.BytecodeUtils
-
Test whether the given closure invokes the specified method in the specified class.
- invokeWriteReplace(Object) - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- ioEncryptionKey() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig
-
- ioschema() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
-
- is32BitDecimalType(DataType) - Static method in class org.apache.spark.sql.types.DecimalType
-
Returns whether dt is a DecimalType that fits inside an int
- is64BitDecimalType(DataType) - Static method in class org.apache.spark.sql.types.DecimalType
-
Returns whether dt is a DecimalType that fits inside a long
- isActive() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Returns true if this query is actively running.
- isActive() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- isActive() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
-
- isActive() - Method in class org.apache.spark.status.LiveExecutor
-
- isAddIntercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Get if the algorithm uses addIntercept
- isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
-
- isBatchingEnabled(SparkConf, boolean) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- isBindCollision(Throwable) - Static method in class org.apache.spark.util.Utils
-
Return whether the exception is caused by an address-port collision when binding.
- isBlacklisted() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- isBlacklisted() - Method in class org.apache.spark.status.LiveExecutor
-
- isBlacklisted() - Method in class org.apache.spark.status.LiveExecutorStageSummary
-
- isBlacklistedForStage() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- isBroadcast() - Method in class org.apache.spark.storage.BlockId
-
- isBucket() - Method in class org.apache.spark.sql.catalog.Column
-
- isByteArrayDecimalType(DataType) - Static method in class org.apache.spark.sql.types.DecimalType
-
Returns whether dt is a DecimalType that doesn't fit inside a long
- isCached(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns true if the table is currently cached in-memory.
- isCached(String) - Method in class org.apache.spark.sql.SQLContext
-
Returns true if the table is currently cached in-memory.
- isCached() - Method in class org.apache.spark.storage.BlockStatus
-
- isCached() - Method in class org.apache.spark.storage.RDDInfo
-
- isCancelled() - Method in class org.apache.spark.ComplexFutureAction
-
- isCancelled() - Method in interface org.apache.spark.FutureAction
-
Returns whether the action has been cancelled.
- isCancelled() - Method in class org.apache.spark.SimpleFutureAction
-
- isCascadingTruncateTable() - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
-
- isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
-
- isCascadingTruncateTable() - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Return Some[true] iff TRUNCATE TABLE cascades by default.
- isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return whether this RDD has been checkpointed or not
- isCheckpointed() - Method in class org.apache.spark.graphx.Graph
-
Return whether this Graph has been checkpointed or not.
- isCheckpointed() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- isCheckpointed() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- isCheckpointed() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- isCheckpointed() - Method in class org.apache.spark.rdd.RDD
-
Return whether this RDD is checkpointed and materialized, either reliably or locally.
- isCliSessionState() - Static method in class org.apache.spark.sql.hive.HiveUtils
-
Check current Thread's SessionState type
- isColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Indicates whether the values backing this matrix are arranged in column major order.
- isCompatible(BloomFilter) - Method in class org.apache.spark.util.sketch.BloomFilter
-
Determines whether a given bloom filter is compatible with this bloom filter.
- isCompleted() - Method in class org.apache.spark.BarrierTaskContext
-
- isCompleted() - Method in class org.apache.spark.ComplexFutureAction
-
- isCompleted() - Method in interface org.apache.spark.FutureAction
-
Returns whether the action has already been completed with a value or an exception.
- isCompleted() - Method in class org.apache.spark.SimpleFutureAction
-
- isCompleted() - Method in class org.apache.spark.TaskContext
-
Returns true if the task has completed.
- isDataAvailable() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
-
- isDefined(Param<?>) - Method in interface org.apache.spark.ml.param.Params
-
Checks whether a param is explicitly set or has a default value.
- isDistributed() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
- isDistributed() - Method in class org.apache.spark.ml.clustering.LDAModel
-
- isDistributed() - Method in class org.apache.spark.ml.clustering.LocalLDAModel
-
- isDriver() - Method in class org.apache.spark.storage.BlockManagerId
-
- isDynamicAllocationEnabled(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Return whether dynamic allocation is enabled in the given conf.
- isEmpty() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- isEmpty() - Method in class org.apache.spark.rdd.RDD
-
- isEmpty() - Method in class org.apache.spark.sql.Dataset
-
Returns true if the Dataset is empty.
- isExecutorStartupConf(String) - Static method in class org.apache.spark.SparkConf
-
Return whether the given config should be passed to an executor on start-up.
- isExperiment() - Method in class org.apache.spark.mllib.stat.test.BinarySample
-
- isFailed(Enumeration.Value) - Static method in class org.apache.spark.TaskState
-
- isFatalError(Throwable) - Static method in class org.apache.spark.util.Utils
-
Returns true if the given exception was fatal.
- isFile(Path) - Static method in class org.apache.spark.ml.image.SamplePathFilter
-
- isFinal() - Method in enum org.apache.spark.launcher.SparkAppHandle.State
-
Whether this state is a final state, meaning the application is not running anymore
once it's reached.
- isFinished(Enumeration.Value) - Static method in class org.apache.spark.TaskState
-
- isIgnorableException(Throwable) - Method in interface org.apache.spark.util.ListenerBus
-
Allows bus implementations to prevent error logging for certain exceptions.
- isin(Object...) - Method in class org.apache.spark.sql.Column
-
A boolean expression that is evaluated to true if the value of this expression is contained
by the evaluated values of the arguments.
- isin(Seq<Object>) - Method in class org.apache.spark.sql.Column
-
A boolean expression that is evaluated to true if the value of this expression is contained
by the evaluated values of the arguments.
- isInCollection(Iterable<?>) - Method in class org.apache.spark.sql.Column
-
A boolean expression that is evaluated to true if the value of this expression is contained
by the provided collection.
- isInCollection(Iterable<?>) - Method in class org.apache.spark.sql.Column
-
A boolean expression that is evaluated to true if the value of this expression is contained
by the provided collection.
- isInDirectory(File, File) - Static method in class org.apache.spark.util.Utils
-
Return whether the specified file is a parent directory of the child file.
- isIndylambdaScalaClosure(SerializedLambda) - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
- isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
-
- isInnerClassCtorCapturingOuter(int, String, String, String) - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
Check if the callee of a call site is an inner class constructor.
- isInterrupted() - Method in class org.apache.spark.BarrierTaskContext
-
- isInterrupted() - Method in class org.apache.spark.TaskContext
-
Returns true if the task has been killed.
- isLambdaBodyCapturingOuter(Handle, String) - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
Check if the handle represents a target method that is:
- a STATIC method that implements a Scala lambda body in the indylambda style
- captures the enclosing this, i.e.
- isLambdaMetafactory(Handle) - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
Check if the handle represents the LambdaMetafactory that indylambda Scala closures
use for creating the lambda class and getting a closure instance.
- isLargerBetter() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- isLargerBetter() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- isLargerBetter() - Method in class org.apache.spark.ml.evaluation.Evaluator
-
Indicates whether the metric returned by evaluate should be maximized (true, default) or minimized (false).
- isLargerBetter() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- isLargerBetter() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- isLeaf() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
-
- isLeftChild(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Returns true if this is a left child.
- isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- isLocal() - Method in class org.apache.spark.SparkContext
-
- isLocal() - Method in class org.apache.spark.sql.Dataset
-
Returns true if the collect and take methods can be run locally (without any Spark executors).
- isLocal() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
-
- isLocalMaster(SparkConf) - Static method in class org.apache.spark.util.Utils
-
- isMac() - Static method in class org.apache.spark.util.Utils
-
Whether the underlying operating system is Mac OS X.
- isModifiable(String) - Method in class org.apache.spark.sql.RuntimeConfig
-
Indicates whether the configuration property with the given key
is modifiable in the current session.
- isMulticlassClassification() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
-
- isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
-
- isNaN() - Method in class org.apache.spark.sql.Column
-
True if the current expression is NaN.
- isnan(Column) - Static method in class org.apache.spark.sql.functions
-
Return true iff the column is NaN.
- isNominal() - Method in class org.apache.spark.ml.attribute.Attribute
-
- isNominal() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
-
- isNominal() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
- isNominal() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- isNominal() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- isNotNull() - Method in class org.apache.spark.sql.Column
-
True if the current expression is NOT null.
- IsNotNull - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a non-null value.
- IsNotNull(String) - Constructor for class org.apache.spark.sql.sources.IsNotNull
-
- isNull() - Method in class org.apache.spark.sql.Column
-
True if the current expression is null.
- isnull(Column) - Static method in class org.apache.spark.sql.functions
-
Return true iff the column is null.
- IsNull - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to null.
- IsNull(String) - Constructor for class org.apache.spark.sql.sources.IsNull
-
- isNullAt(int) - Method in interface org.apache.spark.sql.Row
-
Checks whether the value at position i is null.
- isNullAt(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
-
- isNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- isNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- isNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector
-
Returns whether the value at rowId is NULL.
- isNumeric() - Method in class org.apache.spark.ml.attribute.Attribute
-
- isNumeric() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
-
- isNumeric() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
- isNumeric() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- isNumeric() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- isOpen() - Method in class org.apache.spark.security.CryptoStreamUtils.ErrorHandlingReadableChannel
-
- isOpen() - Method in class org.apache.spark.storage.CountingWritableChannel
-
- isOrdinal() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
- isotonic() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase
-
Param for whether the output sequence should be isotonic/increasing (true) or
antitonic/decreasing (false).
- isotonic() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- IsotonicRegression - Class in org.apache.spark.ml.regression
-
Isotonic regression.
- IsotonicRegression(String) - Constructor for class org.apache.spark.ml.regression.IsotonicRegression
-
- IsotonicRegression() - Constructor for class org.apache.spark.ml.regression.IsotonicRegression
-
- IsotonicRegression - Class in org.apache.spark.mllib.regression
-
Isotonic regression.
- IsotonicRegression() - Constructor for class org.apache.spark.mllib.regression.IsotonicRegression
-
Constructs IsotonicRegression instance with default parameter isotonic = true.
- IsotonicRegressionBase - Interface in org.apache.spark.ml.regression
-
Params for isotonic regression.
- IsotonicRegressionModel - Class in org.apache.spark.ml.regression
-
Model fitted by IsotonicRegression.
- IsotonicRegressionModel - Class in org.apache.spark.mllib.regression
-
Regression model for isotonic regression.
- IsotonicRegressionModel(double[], double[], boolean) - Constructor for class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- IsotonicRegressionModel(Iterable<Object>, Iterable<Object>, Boolean) - Constructor for class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
A Java-friendly constructor that takes two Iterable parameters and one Boolean parameter.
- isOutputSpecValidationEnabled(SparkConf) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
-
- isPartition() - Method in class org.apache.spark.sql.catalog.Column
-
- isPresent() - Method in class org.apache.spark.api.java.Optional
-
- isRDD() - Method in class org.apache.spark.storage.BlockId
-
- isReady() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- isRegistered() - Method in class org.apache.spark.util.AccumulatorV2
-
Returns true if this accumulator has been registered.
- isRInstalled() - Static method in class org.apache.spark.api.r.RUtils
-
Check if R is installed before running tests that use R commands.
- isRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Indicates whether the values backing this matrix are arranged in row major order.
- isRunningLocally() - Method in class org.apache.spark.BarrierTaskContext
-
- isRunningLocally() - Method in class org.apache.spark.TaskContext
-
- isSet(Param<?>) - Method in interface org.apache.spark.ml.param.Params
-
Checks whether a param is explicitly set.
- isShuffle() - Method in class org.apache.spark.storage.BlockId
-
- isSparkPortConf(String) - Static method in class org.apache.spark.SparkConf
-
Return true if the given config matches either spark.*.port or spark.port.*.
- isSparkRInstalled() - Static method in class org.apache.spark.api.r.RUtils
-
Check if SparkR is installed before running tests that use SparkR.
- isSplitable(SparkSession, Map<String, String>, Path) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Check if the receiver has started or not.
- isStopped() - Method in class org.apache.spark.SparkContext
-
- isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Check if receiver has been marked for stopping.
- isStreaming() - Method in class org.apache.spark.sql.Dataset
-
Returns true if this Dataset contains one or more sources that continuously
return data as it arrives.
- isSubClassOf(Type, Class<?>) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- isTemporary() - Method in class org.apache.spark.sql.catalog.Function
-
- isTemporary() - Method in class org.apache.spark.sql.catalog.Table
-
- isTesting() - Static method in class org.apache.spark.util.Utils
-
Indicates whether Spark is currently running unit tests.
- isTimingOut() - Method in class org.apache.spark.streaming.State
-
Whether the state is timing out and going to be removed by the system after the current batch.
- isTraceEnabled() - Method in interface org.apache.spark.internal.Logging
-
- isTransposed() - Method in class org.apache.spark.ml.linalg.DenseMatrix
-
- isTransposed() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Flag that keeps track of whether the matrix is transposed or not.
- isTransposed() - Method in class org.apache.spark.ml.linalg.SparseMatrix
-
- isTransposed() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- isTransposed() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Flag that keeps track of whether the matrix is transposed or not.
- isTransposed() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- isTriggerActive() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
-
- isValid() - Method in class org.apache.spark.ml.param.Param
-
- isValid() - Method in class org.apache.spark.storage.StorageLevel
-
- isWindows() - Static method in class org.apache.spark.util.Utils
-
Whether the underlying operating system is Windows.
- isZero() - Method in class org.apache.spark.sql.types.Decimal
-
- isZero() - Method in class org.apache.spark.streaming.Duration
-
- isZero() - Method in class org.apache.spark.util.AccumulatorV2
-
Returns whether this accumulator is the zero value.
- isZero() - Method in class org.apache.spark.util.CollectionAccumulator
-
Returns false if this accumulator instance has any values in it.
- isZero() - Method in class org.apache.spark.util.DoubleAccumulator
-
Returns false if this accumulator has had any values added to it or the sum is non-zero.
- isZero() - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
-
- isZero() - Method in class org.apache.spark.util.LongAccumulator
-
Returns false if this accumulator has had any values added to it or the sum is non-zero.
- item() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
-
- itemCol() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
-
Param for the column name for item ids.
- itemFactors() - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- items() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- itemsCol() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
-
Items column name.
- itemSupport() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
-
- iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
- iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
- iterator() - Method in class org.apache.spark.sql.types.StructType
-
- iterator() - Method in class org.apache.spark.status.RDDPartitionSeq
-
- IV_LENGTH_IN_BYTES() - Static method in class org.apache.spark.security.CryptoStreamUtils
-
- L1Updater - Class in org.apache.spark.mllib.optimization
-
Developer API
Updater for L1 regularized problems.
- L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
-
- label() - Method in class org.apache.spark.ml.feature.LabeledPoint
-
- label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- labelCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Field in "predictions" which gives the true label of each instance (if available).
- labelCol() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
-
- labelCol() - Method in interface org.apache.spark.ml.param.shared.HasLabelCol
-
Param for label column name.
- labelCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
- LabeledPoint - Class in org.apache.spark.ml.feature
-
Class that represents the features and label of a data point.
- LabeledPoint(double, Vector) - Constructor for class org.apache.spark.ml.feature.LabeledPoint
-
- LabeledPoint - Class in org.apache.spark.mllib.regression
-
Class that represents the features and labels of a data point.
- LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
-
- LabelPropagation - Class in org.apache.spark.graphx.lib
-
Label Propagation algorithm.
- LabelPropagation() - Constructor for class org.apache.spark.graphx.lib.LabelPropagation
-
- labels() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Returns the sequence of labels in ascending order.
- labels() - Method in class org.apache.spark.ml.feature.IndexToString
-
Optional param for array of labels specifying index-string mapping.
- labels() - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
-
- labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
-
- labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns the sequence of labels in ascending order.
- labels() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns the sequence of labels in ascending order.
- lag(Column, int) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.
- lag(String, int) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.
- lag(String, int, Object) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
- lag(Column, int, Object) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
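The lag semantics described above can be illustrated without Spark. The following is a minimal plain-Java sketch over an already ordered list; the class name LagSketch and its helper are hypothetical and are not part of org.apache.spark.sql.functions, and partitioning and ordering of the window are assumed to have been applied already.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not Spark's implementation.
class LagSketch {
    // For each position i, take the value `offset` rows earlier,
    // or `defaultValue` when no such row exists.
    static <T> List<T> lag(List<T> rows, int offset, T defaultValue) {
        List<T> out = new ArrayList<>(rows.size());
        for (int i = 0; i < rows.size(); i++) {
            int j = i - offset;
            out.add(j >= 0 && j < rows.size() ? rows.get(j) : defaultValue);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(10, 20, 30, 40);
        System.out.println(lag(rows, 1, null)); // [null, 10, 20, 30]
        System.out.println(lag(rows, 2, 0));    // [0, 0, 10, 20]
    }
}
```

Passing null as the default mirrors the two-argument overloads; a non-null default mirrors the three-argument ones.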
- LambdaMetafactoryClassName() - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
- LambdaMetafactoryMethodDesc() - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
- LambdaMetafactoryMethodName() - Static method in class org.apache.spark.util.IndylambdaScalaClosures
-
- LassoModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using Lasso.
- LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
-
- LassoWithSGD - Class in org.apache.spark.mllib.regression
-
Train a regression model with L1-regularization using Stochastic Gradient Descent.
- LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD
-
- last(Column, boolean) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the last value in a group.
- last(String, boolean) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the last value of the column in a group.
- last(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the last value in a group.
- last(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the last value of the column in a group.
- last_day(Column) - Static method in class org.apache.spark.sql.functions
-
Returns the last day of the month which the given date belongs to.
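As a rough illustration of what "last day of the month" means for a single date, here is a plain-Java sketch using java.time. LastDaySketch is a hypothetical name; the Spark function itself operates on a Column, which this sketch does not model.

```java
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Hypothetical illustration class; mirrors the described result for one date.
class LastDaySketch {
    // Move the date to the final day of its own month (leap years included).
    static LocalDate lastDay(LocalDate date) {
        return date.with(TemporalAdjusters.lastDayOfMonth());
    }

    public static void main(String[] args) {
        System.out.println(lastDay(LocalDate.of(2015, 7, 27))); // 2015-07-31
        System.out.println(lastDay(LocalDate.of(2016, 2, 10))); // 2016-02-29
    }
}
```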
- lastDir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- lastError() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
-
- lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- lastErrorMessage() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
-
- lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- lastErrorTime() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
-
- lastErrorTime() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- lastProgress() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
- lastStageNameAndDescription(org.apache.spark.status.AppStatusStore, JobData) - Static method in class org.apache.spark.ui.jobs.ApiHelper
-
- lastUpdate() - Method in class org.apache.spark.status.LiveRDDDistribution
-
- lastUpdated() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- Latest() - Constructor for class org.apache.spark.streaming.kinesis.KinesisInitialPositions.Latest
-
- latestModel() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Return the latest model.
- latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Return the latest model.
- launch() - Method in class org.apache.spark.launcher.SparkLauncher
-
Launches a sub-process that will start the configured Spark application.
- LAUNCH_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- LAUNCHING() - Static method in class org.apache.spark.TaskState
-
- LaunchTask(org.apache.spark.util.SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
-
- LaunchTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
-
- launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
- launchTime() - Method in class org.apache.spark.status.api.v1.TaskData
-
- Layer - Interface in org.apache.spark.ml.ann
-
Trait that holds the Layer properties needed to instantiate it.
- LayerModel - Interface in org.apache.spark.ml.ann
-
Trait that holds Layer weights (or parameters).
- layerModels() - Method in interface org.apache.spark.ml.ann.TopologyModel
-
Array of layer models
- layers() - Method in interface org.apache.spark.ml.ann.TopologyModel
-
Array of layers
- layers() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
-
- layers() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
-
Layer sizes including input size and output size.
- LBFGS - Class in org.apache.spark.mllib.optimization
-
Developer API
Class used to solve an optimization problem using Limited-memory BFGS.
- LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
-
- LDA - Class in org.apache.spark.ml.clustering
-
Latent Dirichlet Allocation (LDA), a topic model designed for text documents.
- LDA(String) - Constructor for class org.apache.spark.ml.clustering.LDA
-
- LDA() - Constructor for class org.apache.spark.ml.clustering.LDA
-
- LDA - Class in org.apache.spark.mllib.clustering
-
Latent Dirichlet Allocation (LDA), a topic model designed for text documents.
- LDA() - Constructor for class org.apache.spark.mllib.clustering.LDA
-
Constructs an LDA instance with default parameters.
- LDAModel - Class in org.apache.spark.ml.clustering
-
- LDAModel - Class in org.apache.spark.mllib.clustering
-
Latent Dirichlet Allocation (LDA) model.
- LDAOptimizer - Interface in org.apache.spark.mllib.clustering
-
Developer API
- LDAParams - Interface in org.apache.spark.ml.clustering
-
- LDAUtils - Class in org.apache.spark.mllib.clustering
-
Utility methods for LDA.
- LDAUtils() - Constructor for class org.apache.spark.mllib.clustering.LDAUtils
-
- lead(String, int) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.
- lead(Column, int) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.
- lead(String, int, Object) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.
- lead(Column, int, Object) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.
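The lead semantics look forward rather than backward. A minimal plain-Java sketch, assuming the rows are already in window order; LeadSketch is a hypothetical name and this is not how Spark evaluates the function internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not Spark's implementation.
class LeadSketch {
    // For each position i, take the value `offset` rows later,
    // or `defaultValue` when the list ends before that row.
    static <T> List<T> lead(List<T> rows, int offset, T defaultValue) {
        List<T> out = new ArrayList<>(rows.size());
        for (int i = 0; i < rows.size(); i++) {
            int j = i + offset;
            out.add(j >= 0 && j < rows.size() ? rows.get(j) : defaultValue);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(10, 20, 30, 40);
        System.out.println(lead(rows, 1, null)); // [20, 30, 40, null]
        System.out.println(lead(rows, 3, -1));   // [40, -1, -1, -1]
    }
}
```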
- LeafNode - Class in org.apache.spark.ml.tree
-
Decision tree leaf node.
- learningDecay() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
For Online optimizer only: optimizer = "online".
- learningOffset() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
For Online optimizer only: optimizer = "online".
- learningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- least(Column...) - Static method in class org.apache.spark.sql.functions
-
Returns the least value of the list of values, skipping null values.
- least(String, String...) - Static method in class org.apache.spark.sql.functions
-
Returns the least value of the list of column names, skipping null values.
- least(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Returns the least value of the list of values, skipping null values.
- least(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Returns the least value of the list of column names, skipping null values.
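The "skipping null values" rule can be sketched for ordinary Java values as follows. LeastSketch is a hypothetical helper, not the Spark function, and returning null when every input is null is an assumption made for this sketch.

```java
// Hypothetical illustration class; works on plain values, not Columns.
class LeastSketch {
    // Smallest non-null argument; null only if every argument is null
    // (an assumption of this sketch).
    @SafeVarargs
    static <T extends Comparable<T>> T least(T... values) {
        T min = null;
        for (T v : values) {
            if (v != null && (min == null || v.compareTo(min) < 0)) {
                min = v;
            }
        }
        return min;
    }

    public static void main(String[] args) {
        System.out.println(least(3, null, 1, 2));         // 1
        System.out.println(least("pear", "apple", null)); // apple
    }
}
```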
- LeastSquaresGradient - Class in org.apache.spark.mllib.optimization
-
Developer API
Compute gradient and loss for a Least-squared loss function, as used in linear regression.
- LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- left() - Method in class org.apache.spark.sql.sources.And
-
- left() - Method in class org.apache.spark.sql.sources.Or
-
- leftCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit
-
Get sorted categories which split to the left.
- leftCategoriesOrThreshold() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
-
- leftChild() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
-
- leftChild() - Method in class org.apache.spark.ml.tree.InternalNode
-
- leftChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the left child of this node.
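The entry above only states that the method returns the left child's index. One common convention for binary trees, assumed here and not confirmed by this index, numbers the root as 1 so that the children of node i are 2i and 2i + 1. A sketch under that assumption, with hypothetical names:

```java
// Hypothetical illustration class; assumes 1-based heap-style node numbering.
class NodeIndexSketch {
    static int leftChildIndex(int nodeIndex) {
        return nodeIndex << 1;          // 2 * i
    }

    static int rightChildIndex(int nodeIndex) {
        return (nodeIndex << 1) + 1;    // 2 * i + 1
    }

    static int parentIndex(int nodeIndex) {
        return nodeIndex >> 1;          // floor(i / 2)
    }

    public static void main(String[] args) {
        System.out.println(leftChildIndex(1));  // 2
        System.out.println(rightChildIndex(3)); // 7
        System.out.println(parentIndex(7));     // 3
    }
}
```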
- leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
-
Left joins this VertexRDD with an RDD containing vertex attribute pairs.
- leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
-
- leftNodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
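To make the left outer join semantics concrete, here is a small in-memory sketch: every pair on the left is kept, and the right-hand value is wrapped in an Optional that is empty when the key has no match. LeftOuterJoinSketch, pair, and the use of java.util.Optional are assumptions for illustration; this is not the RDD implementation and ignores partitioning.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration class; in-memory only, no partitioning.
class LeftOuterJoinSketch {
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    static <K, V, W> List<Map.Entry<K, Map.Entry<V, Optional<W>>>> leftOuterJoin(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
        // Index the right side by key; a key may carry several values.
        Map<K, List<W>> index = new HashMap<>();
        for (Map.Entry<K, W> e : right) {
            index.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        List<Map.Entry<K, Map.Entry<V, Optional<W>>>> out = new ArrayList<>();
        for (Map.Entry<K, V> e : left) {
            List<W> matches = index.get(e.getKey());
            if (matches == null) {
                // No match: keep the left row with an empty right side.
                Map.Entry<V, Optional<W>> row = pair(e.getValue(), Optional.<W>empty());
                out.add(pair(e.getKey(), row));
            } else {
                for (W w : matches) {
                    Map.Entry<V, Optional<W>> row = pair(e.getValue(), Optional.of(w));
                    out.add(pair(e.getKey(), row));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = Arrays.asList(pair("a", 1), pair("b", 2));
        List<Map.Entry<String, String>> right = Arrays.asList(pair("a", "x"));
        for (Map.Entry<String, Map.Entry<Integer, Optional<String>>> e : leftOuterJoin(left, right)) {
            System.out.println(e.getKey() + " -> (" + e.getValue().getKey()
                    + ", " + e.getValue().getValue() + ")");
        }
        // a -> (1, Optional[x])
        // b -> (2, Optional.empty)
    }
}
```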
- leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
-
Left joins this RDD with another VertexRDD with the same index.
- LegacyAccumulatorWrapper<R,T> - Class in org.apache.spark.util
-
- LegacyAccumulatorWrapper(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.util.LegacyAccumulatorWrapper
-
- length() - Method in class org.apache.spark.scheduler.SplitInfo
-
- length(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the character length of a given string or number of bytes of a binary string.
- length() - Method in interface org.apache.spark.sql.Row
-
Number of elements in the Row.
- length() - Method in class org.apache.spark.sql.types.CharType
-
- length() - Method in class org.apache.spark.sql.types.HiveStringType
-
- length() - Method in class org.apache.spark.sql.types.StructType
-
- length() - Method in class org.apache.spark.sql.types.VarcharType
-
- length() - Method in class org.apache.spark.status.RDDPartitionSeq
-
- leq(Object) - Method in class org.apache.spark.sql.Column
-
Less than or equal to.
- less(Duration) - Method in class org.apache.spark.streaming.Duration
-
- less(Time) - Method in class org.apache.spark.streaming.Time
-
- lessEq(Duration) - Method in class org.apache.spark.streaming.Duration
-
- lessEq(Time) - Method in class org.apache.spark.streaming.Time
-
- LessThan - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a value less than value.
- LessThan(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThan
-
- LessThanOrEqual - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a value less than or equal to value.
- LessThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThanOrEqual
-
- levenshtein(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Computes the Levenshtein distance of the two given string columns.
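For readers unfamiliar with the metric, the Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A standard two-row dynamic-programming sketch in plain Java follows; LevenshteinSketch is a hypothetical name and this is not claimed to be the algorithm Spark uses internally.

```java
// Hypothetical illustration class; textbook dynamic programming.
class LevenshteinSketch {
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // 3
        System.out.println(levenshtein("", "abc"));           // 3
    }
}
```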
- libraryPathEnvName() - Static method in class org.apache.spark.util.Utils
-
Return the current system LD_LIBRARY_PATH name
- libraryPathEnvPrefix(Seq<String>) - Static method in class org.apache.spark.util.Utils
-
Return the prefix of a command that appends the given library paths to the
system-specific library path environment variable.
- LibSVMDataSource - Class in org.apache.spark.ml.source.libsvm
-
The libsvm package implements the Spark SQL data source API for loading LIBSVM data as a DataFrame.
- LibSVMDataSource() - Constructor for class org.apache.spark.ml.source.libsvm.LibSVMDataSource
-
- lift() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
-
Returns the lift of the rule.
- like(String) - Method in class org.apache.spark.sql.Column
-
SQL like expression.
- limit(int) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset by taking the first n rows.
- line() - Method in exception org.apache.spark.sql.AnalysisException
-
- LinearDataGenerator - Class in org.apache.spark.mllib.util
-
Developer API
Generate sample data used for Linear Data.
- LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
-
- LinearRegression - Class in org.apache.spark.ml.regression
-
Linear regression.
- LinearRegression(String) - Constructor for class org.apache.spark.ml.regression.LinearRegression
-
- LinearRegression() - Constructor for class org.apache.spark.ml.regression.LinearRegression
-
- LinearRegressionModel - Class in org.apache.spark.ml.regression
-
- LinearRegressionModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using LinearRegression.
- LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
-
- LinearRegressionParams - Interface in org.apache.spark.ml.regression
-
Params for linear regression.
- LinearRegressionSummary - Class in org.apache.spark.ml.regression
-
Experimental
Linear regression results evaluated on a dataset.
- LinearRegressionTrainingSummary - Class in org.apache.spark.ml.regression
-
Experimental
Linear regression training results.
- LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train a linear regression model with no regularization using Stochastic Gradient Descent.
- LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
- LinearSVC - Class in org.apache.spark.ml.classification
-
Experimental
- LinearSVC(String) - Constructor for class org.apache.spark.ml.classification.LinearSVC
-
- LinearSVC() - Constructor for class org.apache.spark.ml.classification.LinearSVC
-
- LinearSVCModel - Class in org.apache.spark.ml.classification
-
Experimental
Linear SVM Model trained by LinearSVC.
- LinearSVCParams - Interface in org.apache.spark.ml.classification
-
Params for linear SVM Classifier.
- link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
-
- link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
-
- link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
-
- link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
-
- link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$
-
- link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$
-
- link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
-
- link() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
Param for the name of link function which provides the relationship
between the linear predictor and the mean of the distribution function.
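A link function maps the mean of the distribution to the linear predictor. The sketch below writes out the textbook formulas for several of the link names that appear in the entries above (Identity, Log, Logit, Inverse, Sqrt, CLogLog); Probit is omitted because it needs an inverse normal CDF. LinkSketch and its constants are hypothetical names, and the formulas are the standard statistical definitions rather than code taken from Spark.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration class; standard link formulas g(mu).
class LinkSketch {
    static final DoubleUnaryOperator IDENTITY = mu -> mu;
    static final DoubleUnaryOperator LOG = Math::log;
    static final DoubleUnaryOperator LOGIT = mu -> Math.log(mu / (1.0 - mu));
    static final DoubleUnaryOperator INVERSE = mu -> 1.0 / mu;
    static final DoubleUnaryOperator SQRT = Math::sqrt;
    static final DoubleUnaryOperator CLOGLOG = mu -> Math.log(-Math.log(1.0 - mu));

    public static void main(String[] args) {
        System.out.println(LOGIT.applyAsDouble(0.5));   // 0.0
        System.out.println(INVERSE.applyAsDouble(4.0)); // 0.25
        System.out.println(SQRT.applyAsDouble(9.0));    // 3.0
    }
}
```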
- Link$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Link$
-
- linkPower() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
Param for the index in the power link function.
- linkPredictionCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
Param for link prediction (linear predictor) column name.
- listColumns(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns a list of columns for the given table/view or temporary view.
- listColumns(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns a list of columns for the given table/view in the specified database.
- listDatabases() - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns a list of databases available across all sessions.
- listDatabases(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
List the names of all the databases that match the specified pattern.
- ListenerBus<L,E> - Interface in org.apache.spark.util
-
An event bus which posts events to its listeners.
- listenerManager() - Method in class org.apache.spark.sql.SparkSession
-
- listenerManager() - Method in class org.apache.spark.sql.SQLContext
-
- listeners() - Method in interface org.apache.spark.util.ListenerBus
-
- listFiles() - Method in class org.apache.spark.SparkContext
-
Returns a list of file paths that are added to resources.
- listFunctions() - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns a list of functions registered in the current database.
- listFunctions(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns a list of functions registered in the specified database.
- listFunctions(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Return the names of all functions that match the given pattern in the database.
- listingTable(Seq<String>, Function1<T, Seq<Node>>, Iterable<T>, boolean, Option<String>, Seq<String>, boolean, boolean) - Static method in class org.apache.spark.ui.UIUtils
-
Returns an HTML table constructed by generating a row for each object in a sequence.
- listJars() - Method in class org.apache.spark.SparkContext
-
Returns a list of jar files that are added to resources.
- listOrcFiles(String, Configuration) - Static method in class org.apache.spark.sql.hive.orc.OrcFileOperator
-
- listTables() - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns a list of tables/views in the current database.
- listTables(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Returns a list of tables/views in the specified database.
- listTables(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the names of all tables in the given database.
- listTables(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Returns the names of tables in the given database that match the given pattern.
- lit(Object) - Static method in class org.apache.spark.sql.functions
-
Creates a Column of literal value.
- literal(String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- LIVE_ENTITY_UPDATE_MIN_FLUSH_PERIOD() - Static method in class org.apache.spark.status.config
-
- LIVE_ENTITY_UPDATE_PERIOD() - Static method in class org.apache.spark.status.config
-
- LiveEntityHelpers - Class in org.apache.spark.status
-
- LiveEntityHelpers() - Constructor for class org.apache.spark.status.LiveEntityHelpers
-
- LiveExecutor - Class in org.apache.spark.status
-
- LiveExecutor(String, long) - Constructor for class org.apache.spark.status.LiveExecutor
-
- LiveExecutorStageSummary - Class in org.apache.spark.status
-
- LiveExecutorStageSummary(int, int, String) - Constructor for class org.apache.spark.status.LiveExecutorStageSummary
-
- LiveJob - Class in org.apache.spark.status
-
- LiveJob(int, String, Option<String>, Option<Date>, Seq<Object>, Option<String>, int) - Constructor for class org.apache.spark.status.LiveJob
-
- LiveRDD - Class in org.apache.spark.status
-
- LiveRDD(RDDInfo) - Constructor for class org.apache.spark.status.LiveRDD
-
- LiveRDDDistribution - Class in org.apache.spark.status
-
- LiveRDDDistribution(LiveExecutor) - Constructor for class org.apache.spark.status.LiveRDDDistribution
-
- LiveRDDPartition - Class in org.apache.spark.status
-
- LiveRDDPartition(String) - Constructor for class org.apache.spark.status.LiveRDDPartition
-
- LiveStage - Class in org.apache.spark.status
-
- LiveStage() - Constructor for class org.apache.spark.status.LiveStage
-
- LiveTask - Class in org.apache.spark.status
-
- LiveTask(TaskInfo, int, int, Option<Object>) - Constructor for class org.apache.spark.status.LiveTask
-
- load(String) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- load(String) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- load(String) - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- load(String) - Static method in class org.apache.spark.ml.classification.GBTClassifier
-
- load(String) - Static method in class org.apache.spark.ml.classification.LinearSVC
-
- load(String) - Static method in class org.apache.spark.ml.classification.LinearSVCModel
-
- load(String) - Static method in class org.apache.spark.ml.classification.LogisticRegression
-
- load(String) - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- load(String) - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
-
- load(String) - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
- load(String) - Static method in class org.apache.spark.ml.classification.NaiveBayes
-
- load(String) - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
-
- load(String) - Static method in class org.apache.spark.ml.classification.OneVsRest
-
- load(String) - Static method in class org.apache.spark.ml.classification.OneVsRestModel
-
- load(String) - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- load(String) - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- load(String) - Static method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- load(String) - Static method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
- load(String) - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
- load(String) - Static method in class org.apache.spark.ml.clustering.GaussianMixture
-
- load(String) - Static method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- load(String) - Static method in class org.apache.spark.ml.clustering.KMeans
-
- load(String) - Static method in class org.apache.spark.ml.clustering.KMeansModel
-
- load(String) - Static method in class org.apache.spark.ml.clustering.LDA
-
- load(String) - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
-
- load(String) - Static method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- load(String) - Static method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- load(String) - Static method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- load(String) - Static method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- load(String) - Static method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- load(String) - Static method in class org.apache.spark.ml.feature.Binarizer
-
- load(String) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- load(String) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.Bucketizer
-
- load(String) - Static method in class org.apache.spark.ml.feature.ChiSqSelector
-
- load(String) - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.ColumnPruner
-
- load(String) - Static method in class org.apache.spark.ml.feature.CountVectorizer
-
- load(String) - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.DCT
-
- load(String) - Static method in class org.apache.spark.ml.feature.ElementwiseProduct
-
- load(String) - Static method in class org.apache.spark.ml.feature.FeatureHasher
-
- load(String) - Static method in class org.apache.spark.ml.feature.HashingTF
-
- load(String) - Static method in class org.apache.spark.ml.feature.IDF
-
- load(String) - Static method in class org.apache.spark.ml.feature.IDFModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.Imputer
-
- load(String) - Static method in class org.apache.spark.ml.feature.ImputerModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.IndexToString
-
- load(String) - Static method in class org.apache.spark.ml.feature.Interaction
-
- load(String) - Static method in class org.apache.spark.ml.feature.MaxAbsScaler
-
- load(String) - Static method in class org.apache.spark.ml.feature.MaxAbsScalerModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.MinHashLSH
-
- load(String) - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.MinMaxScaler
-
- load(String) - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.NGram
-
- load(String) - Static method in class org.apache.spark.ml.feature.Normalizer
-
- load(String) - Static method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- load(String) - Static method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- load(String) - Static method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.PCA
-
- load(String) - Static method in class org.apache.spark.ml.feature.PCAModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.PolynomialExpansion
-
- load(String) - Static method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- load(String) - Static method in class org.apache.spark.ml.feature.RegexTokenizer
-
- load(String) - Static method in class org.apache.spark.ml.feature.RFormula
-
- load(String) - Static method in class org.apache.spark.ml.feature.RFormulaModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.SQLTransformer
-
- load(String) - Static method in class org.apache.spark.ml.feature.StandardScaler
-
- load(String) - Static method in class org.apache.spark.ml.feature.StandardScalerModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.StopWordsRemover
-
- load(String) - Static method in class org.apache.spark.ml.feature.StringIndexer
-
- load(String) - Static method in class org.apache.spark.ml.feature.StringIndexerModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.Tokenizer
-
- load(String) - Static method in class org.apache.spark.ml.feature.VectorAssembler
-
- load(String) - Static method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-
- load(String) - Static method in class org.apache.spark.ml.feature.VectorIndexer
-
- load(String) - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- load(String) - Static method in class org.apache.spark.ml.feature.VectorSizeHint
-
- load(String) - Static method in class org.apache.spark.ml.feature.VectorSlicer
-
- load(String) - Static method in class org.apache.spark.ml.feature.Word2Vec
-
- load(String) - Static method in class org.apache.spark.ml.feature.Word2VecModel
-
- load(String) - Static method in class org.apache.spark.ml.fpm.FPGrowth
-
- load(String) - Static method in class org.apache.spark.ml.fpm.FPGrowthModel
-
- load(String) - Static method in class org.apache.spark.ml.Pipeline
-
- load(String, SparkContext, String) - Method in class org.apache.spark.ml.Pipeline.SharedReadWrite$
-
- load(String) - Static method in class org.apache.spark.ml.PipelineModel
-
- load(String) - Static method in class org.apache.spark.ml.r.RWrappers
-
- load(String) - Static method in class org.apache.spark.ml.recommendation.ALS
-
- load(String) - Static method in class org.apache.spark.ml.recommendation.ALSModel
-
- load(String) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- load(String) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- load(String) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- load(String) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- load(String) - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- load(String) - Static method in class org.apache.spark.ml.regression.GBTRegressor
-
- load(String) - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
- load(String) - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
- load(String) - Static method in class org.apache.spark.ml.regression.IsotonicRegression
-
- load(String) - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
- load(String) - Static method in class org.apache.spark.ml.regression.LinearRegression
-
- load(String) - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- load(String) - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- load(String) - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- load(String) - Static method in class org.apache.spark.ml.tuning.CrossValidator
-
- load(String) - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- load(String) - Static method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- load(String) - Static method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-
- load(String) - Method in interface org.apache.spark.ml.util.MLReadable
-
Reads an ML instance from the input path, a shortcut of read.load(path).
- load(String) - Method in class org.apache.spark.ml.util.MLReader
-
Loads the ML component from the input path.
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.SVMModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV2_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.KMeansModel
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV2_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.Word2VecModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.fpm.FPGrowthModel
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.fpm.PrefixSpanModel
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Load a model from the given path.
- load(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LassoModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LinearRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- load(SparkContext, String, String, int) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- load(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Loader
-
Load a model from the given path.
- load(String...) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads input in as a DataFrame, for data sources that support multiple paths.
- load() - Method in class org.apache.spark.sql.DataFrameReader
-
Loads input in as a DataFrame, for data sources that don't require a path (e.g.
- load(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads input in as a DataFrame, for data sources that require a path (e.g.
- load(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads input in as a DataFrame, for data sources that support multiple paths.
- load(String) - Method in class org.apache.spark.sql.SQLContext
-
- load(String, String) - Method in class org.apache.spark.sql.SQLContext
-
- load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- load() - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Loads input data stream in as a DataFrame, for data streams that don't require a path (e.g.
- load(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Loads input in as a DataFrame, for data streams that read from some path.
- loadClass(String, boolean) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
-
- loadClass(String, boolean) - Method in class org.apache.spark.util.ParentClassLoader
-
- loadData(SparkContext, String, String) - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
Helper method for loading GLM classification model data.
- loadData(SparkContext, String, String, int) - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
Helper method for loading GLM regression model data.
- loadDefaultSparkProperties(SparkConf, String) - Static method in class org.apache.spark.util.Utils
-
Load default Spark properties from the given file.
- loadDefaultStopWords(String) - Static method in class org.apache.spark.ml.feature.StopWordsRemover
-
Loads the default stop words for the given language.
- loadDynamicPartitions(String, String, String, LinkedHashMap<String, String>, boolean, int) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Loads new dynamic partitions into an existing table.
- Loader<M extends Saveable> - Interface in org.apache.spark.mllib.util
-
Developer API
- loadExtensions(Class<T>, Seq<String>, SparkConf) - Static method in class org.apache.spark.util.Utils
-
Create instances of extension classes.
- loadImpl(String, SparkSession, String, String) - Static method in class org.apache.spark.ml.tree.EnsembleModelReadWrite
-
Helper method for loading a tree ensemble from disk.
- loadImpl(Dataset<Row>, Item, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
-
- loadImpl(Dataset<Row>, Item, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
-
- loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
- loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of partitions.
- loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
- loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of
partitions.
- loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of
features determined automatically and the default number of partitions.
- loadPartition(String, String, String, LinkedHashMap<String, String>, boolean, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Loads a static partition into an existing table.
- loadTable(String, String, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Loads data into an existing table.
- loadTreeNodes(String, org.apache.spark.ml.util.DefaultParamsReader.Metadata, SparkSession) - Static method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite
-
Load a decision tree from a file.
- loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads vectors saved using RDD[Vector].saveAsTextFile.
- loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
- LOCAL_BLOCKS_FETCHED() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
-
- LOCAL_BYTES_READ() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
-
- LOCAL_CLUSTER_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-
- LOCAL_N_FAILURES_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-
- LOCAL_N_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-
- localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- localBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- localCanonicalHostName() - Static method in class org.apache.spark.util.Utils
-
Get the local machine's FQDN.
- localCheckpoint() - Method in class org.apache.spark.rdd.RDD
-
Mark this RDD for local checkpointing using Spark's existing caching layer.
- localCheckpoint() - Method in class org.apache.spark.sql.Dataset
-
Eagerly locally checkpoints a Dataset and returns the new Dataset.
- localCheckpoint(boolean) - Method in class org.apache.spark.sql.Dataset
-
Locally checkpoints a Dataset and returns the new Dataset.
- locale() - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
Locale of the input for case insensitive matching.
- localHostName() - Static method in class org.apache.spark.util.Utils
-
Get the local machine's hostname.
- localHostNameForURI() - Static method in class org.apache.spark.util.Utils
-
Get the local machine's URI.
- LOCALITY() - Static method in class org.apache.spark.status.TaskIndexNames
-
- localityAwareTasks() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
-
- localitySummary() - Method in class org.apache.spark.status.LiveStage
-
- LocalKMeans - Class in org.apache.spark.mllib.clustering
-
A utility object to run K-means locally.
- LocalKMeans() - Constructor for class org.apache.spark.mllib.clustering.LocalKMeans
-
- LocalLDAModel - Class in org.apache.spark.ml.clustering
-
Local (non-distributed) model fitted by LDA.
- LocalLDAModel - Class in org.apache.spark.mllib.clustering
-
Local LDA model.
- localSeqToDatasetHolder(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLImplicits
-
Creates a Dataset from a local Seq.
- localSparkRPackagePath() - Static method in class org.apache.spark.api.r.RUtils
-
Get the SparkR package path in the local spark distribution.
- localValue() - Method in class org.apache.spark.Accumulable
-
Deprecated.
Get the current value of this accumulator from within a task.
- locate(String, Column) - Static method in class org.apache.spark.sql.functions
-
Locate the position of the first occurrence of substr.
- locate(String, Column, int) - Static method in class org.apache.spark.sql.functions
-
Locate the position of the first occurrence of substr in a string column, after position pos.
- location() - Method in interface org.apache.spark.scheduler.MapStatus
-
Location where this task was run.
- location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- location() - Method in class org.apache.spark.ui.storage.ExecutorStreamSummary
-
- locations() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockLocationsAndStatus
-
- locationUri() - Method in class org.apache.spark.sql.catalog.Database
-
- log() - Method in interface org.apache.spark.internal.Logging
-
- log(Function0<Parsers.Parser<T>>, String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- log(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the natural logarithm of the given value.
- log(String) - Static method in class org.apache.spark.sql.functions
-
Computes the natural logarithm of the given column.
- log(double, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the first argument-base logarithm of the second argument.
- log(double, String) - Static method in class org.apache.spark.sql.functions
-
Returns the first argument-base logarithm of the second argument.
- Log$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
-
- log10(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the logarithm of the given value in base 10.
- log10(String) - Static method in class org.apache.spark.sql.functions
-
Computes the logarithm of the given value in base 10.
- log1p(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the natural logarithm of the given value plus one.
- log1p(String) - Static method in class org.apache.spark.sql.functions
-
Computes the natural logarithm of the given column plus one.
- log2(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the logarithm of the given column in base 2.
- log2(String) - Static method in class org.apache.spark.sql.functions
-
Computes the logarithm of the given value in base 2.
- log_() - Method in interface org.apache.spark.internal.Logging
-
- logDebug(Function0<String>) - Method in interface org.apache.spark.internal.Logging
-
- logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
-
- logDeprecationWarning(String) - Static method in class org.apache.spark.SparkConf
-
Logs a warning message if the given config key is deprecated.
- logError(Function0<String>) - Method in interface org.apache.spark.internal.Logging
-
- logError(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
-
- logEvent() - Method in interface org.apache.spark.scheduler.SparkListenerEvent
-
- Logging - Interface in org.apache.spark.internal
-
Utility trait for classes that want to log data.
- logInfo(Function0<String>) - Method in interface org.apache.spark.internal.Logging
-
- logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
-
- LogisticGradient - Class in org.apache.spark.mllib.optimization
-
Developer API
Compute gradient and loss for a multinomial logistic loss function, as used
in multi-class classification (it is also used in binary logistic regression).
- LogisticGradient(int) - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
-
- LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
-
- LogisticRegression - Class in org.apache.spark.ml.classification
-
Logistic regression.
- LogisticRegression(String) - Constructor for class org.apache.spark.ml.classification.LogisticRegression
-
- LogisticRegression() - Constructor for class org.apache.spark.ml.classification.LogisticRegression
-
- LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util
-
Developer API
Generate test data for LogisticRegression.
- LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
- LogisticRegressionModel - Class in org.apache.spark.ml.classification
-
- LogisticRegressionModel - Class in org.apache.spark.mllib.classification
-
Classification model trained using Multinomial/Binary Logistic Regression.
- LogisticRegressionModel(Vector, double, int, int) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- LogisticRegressionParams - Interface in org.apache.spark.ml.classification
-
Params for logistic regression.
- LogisticRegressionSummary - Interface in org.apache.spark.ml.classification
-
Experimental
Abstraction for logistic regression results for a given model.
- LogisticRegressionSummaryImpl - Class in org.apache.spark.ml.classification
-
Multiclass logistic regression results for a given model.
- LogisticRegressionSummaryImpl(Dataset<Row>, String, String, String, String) - Constructor for class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
-
- LogisticRegressionTrainingSummary - Interface in org.apache.spark.ml.classification
-
Experimental
Abstraction for multiclass logistic regression training results.
- LogisticRegressionTrainingSummaryImpl - Class in org.apache.spark.ml.classification
-
Multiclass logistic regression training results.
- LogisticRegressionTrainingSummaryImpl(Dataset<Row>, String, String, String, String, double[]) - Constructor for class org.apache.spark.ml.classification.LogisticRegressionTrainingSummaryImpl
-
- LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification
-
Train a classification model for Multinomial/Binary Logistic Regression using
Limited-memory BFGS.
- LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-
- LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
-
Train a classification model for Binary Logistic Regression
using Stochastic Gradient Descent.
- LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
- Logit$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$
-
- logLikelihood() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
-
- logLikelihood() - Method in class org.apache.spark.ml.clustering.GaussianMixtureSummary
-
- logLikelihood(Dataset<?>) - Method in class org.apache.spark.ml.clustering.LDAModel
-
Calculates a lower bound on the log likelihood of the entire corpus.
- logLikelihood() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Log likelihood of the observed tokens in the training set,
given the current parameter estimates:
log P(docs | topics, topic distributions for docs, alpha, eta)
- logLikelihood() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- logLikelihood(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Calculates a lower bound on the log likelihood of the entire corpus.
- logLikelihood(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Java-friendly version of logLikelihood
- LogLoss - Class in org.apache.spark.mllib.tree.loss
-
Developer API
Class for log loss calculation (for classification).
- LogLoss() - Constructor for class org.apache.spark.mllib.tree.loss.LogLoss
-
- logName() - Method in interface org.apache.spark.internal.Logging
-
- LogNormalGenerator - Class in org.apache.spark.mllib.random
-
Developer API
Generates i.i.d.
- LogNormalGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.LogNormalGenerator
-
- logNormalGraph(SparkContext, int, int, double, double, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Generate a graph whose vertex out degree distribution is log normal.
- logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Java-friendly version of RandomRDDs.logNormalRDD.
- logNormalJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.logNormalJavaRDD with the default seed.
- logNormalJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.logNormalJavaRDD with the default number of partitions and the default seed.
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Java-friendly version of RandomRDDs.logNormalVectorRDD.
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.logNormalJavaVectorRDD with the default seed.
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.logNormalJavaVectorRDD with the default number of partitions and the default seed.
- logNormalRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation.
- logNormalVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution.
- logpdf(Vector) - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian
-
Returns the log-density of this multivariate Gaussian at given point, x
- logpdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
Returns the log-density of this multivariate Gaussian at given point, x
- logPerplexity(Dataset<?>) - Method in class org.apache.spark.ml.clustering.LDAModel
-
Calculate an upper bound on perplexity.
- logPerplexity(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Calculate an upper bound on perplexity.
- logPerplexity(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Java-friendly version of logPerplexity
- logPrior() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
Log probability of the current parameter estimate:
log P(topics, topic distributions for docs | Dirichlet hyperparameters)
- logPrior() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Log probability of the current parameter estimate:
log P(topics, topic distributions for docs | alpha, eta)
- logStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- logStartToJson(SparkListenerLogStart) - Static method in class org.apache.spark.util.JsonProtocol
-
- logTrace(Function0<String>) - Method in interface org.apache.spark.internal.Logging
-
- logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
-
- logTuningParams(org.apache.spark.ml.util.Instrumentation) - Method in interface org.apache.spark.ml.tuning.ValidatorParams
-
Instrumentation logging for tuning params including the inner estimator and evaluator info.
- logUncaughtExceptions(Function0<T>) - Static method in class org.apache.spark.util.Utils
-
Execute the given block, logging and re-throwing any uncaught exception.
- logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- logUrls() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- logWarning(Function0<String>) - Method in interface org.apache.spark.internal.Logging
-
- logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
-
- LONG() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable long type.
- longAccumulator() - Method in class org.apache.spark.SparkContext
-
Create and register a long accumulator, which starts with 0 and accumulates inputs by add.
- longAccumulator(String) - Method in class org.apache.spark.SparkContext
-
Create and register a long accumulator, which starts with 0 and accumulates inputs by add.
- LongAccumulator - Class in org.apache.spark.util
-
An accumulator for computing sum, count, and average of 64-bit integers.
- LongAccumulator() - Constructor for class org.apache.spark.util.LongAccumulator
-
- LongAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
Deprecated.
- LongParam - Class in org.apache.spark.ml.param
-
Developer API
Specialized version of Param[Long] for Java.
- LongParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
-
- LongParam(String, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
-
- LongParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
-
- LongParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
-
- LongType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the LongType object.
- LongType - Class in org.apache.spark.sql.types
-
The data type representing Long values.
- LongType() - Constructor for class org.apache.spark.sql.types.LongType
-
- lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the list of values in the RDD for key key.
- lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the list of values in the RDD for key key.
- lookupRpcTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-
Returns the default Spark timeout to use for RPC remote endpoint lookup.
- loss(DenseMatrix<Object>, DenseMatrix<Object>, DenseMatrix<Object>) - Method in interface org.apache.spark.ml.ann.LossFunction
-
Returns the value of loss function.
- loss() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
-
The current loss value of this aggregator.
- loss() - Method in interface org.apache.spark.ml.param.shared.HasLoss
-
Param for the loss function to be optimized.
- loss() - Method in class org.apache.spark.ml.regression.AFTAggregator
-
- loss() - Method in interface org.apache.spark.ml.regression.LinearRegressionParams
-
The loss function to be optimized.
- loss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- Loss - Interface in org.apache.spark.mllib.tree.loss
-
Developer API
Trait for adding "pluggable" loss functions for the gradient boosting algorithm.
- Losses - Class in org.apache.spark.mllib.tree.loss
-
- Losses() - Constructor for class org.apache.spark.mllib.tree.loss.Losses
-
- LossFunction - Interface in org.apache.spark.ml.ann
-
Trait for loss function
- LossReasonPending - Class in org.apache.spark.scheduler
-
A loss reason that means we don't yet know why the executor exited.
- LossReasonPending() - Constructor for class org.apache.spark.scheduler.LossReasonPending
-
- lossSum() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
-
- lossType() - Method in interface org.apache.spark.ml.tree.GBTClassifierParams
-
Loss function which GBT tries to minimize.
- lossType() - Method in interface org.apache.spark.ml.tree.GBTRegressorParams
-
Loss function which GBT tries to minimize.
- LOST() - Static method in class org.apache.spark.TaskState
-
- low() - Method in class org.apache.spark.partial.BoundedDouble
-
- lower(Column) - Static method in class org.apache.spark.sql.functions
-
Converts a string column to lower case.
- lowerBoundsOnCoefficients() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
The lower bounds on coefficients if fitting under bound constrained optimization.
- lowerBoundsOnIntercepts() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
The lower bounds on intercepts if fitting under bound constrained optimization.
- LowPrioritySQLImplicits - Interface in org.apache.spark.sql
-
Lower priority implicit methods for converting Scala objects into Datasets.
- lpad(Column, int, String) - Static method in class org.apache.spark.sql.functions
-
Left-pad the string column with pad to a length of len.
- LSHParams - Interface in org.apache.spark.ml.feature
-
Params for LSH.
- lt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check if value is less than upperBound
- lt(Object) - Method in class org.apache.spark.sql.Column
-
Less than.
- ltEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check if value is less than or equal to upperBound
- ltrim(Column) - Static method in class org.apache.spark.sql.functions
-
Trim the spaces from the left end of the specified string value.
- ltrim(Column, String) - Static method in class org.apache.spark.sql.functions
-
Trim the specified character string from the left end of the specified string column.
- LZ4CompressionCodec - Class in org.apache.spark.io
-
- LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
-
- LZFCompressionCodec - Class in org.apache.spark.io
-
- LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
-
- main(String[]) - Static method in class org.apache.spark.ml.param.shared.SharedParamsCodeGen
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.streaming.util.RawTextSender
-
- main(String[]) - Static method in class org.apache.spark.ui.UIWorkloadGenerator
-
- main(String[]) - Method in interface org.apache.spark.util.CommandLineUtils
-
- majorMinorVersion(String) - Static method in class org.apache.spark.util.VersionUtils
-
Given a Spark version string, return the (major version number, minor version number).
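As a rough illustration of extracting the major and minor numbers from a version string, the following plain-Java sketch uses a regular expression. The class name `VersionParseSketch` and the accepted format (digits, a dot, digits, then optionally anything starting with a dot) are assumptions made for this example, not a statement of what VersionUtils accepts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing one way to parse "major.minor" from a version string.
final class VersionParseSketch {
    // Assumed format: "<major>.<minor>" optionally followed by a suffix starting with '.'.
    private static final Pattern MAJOR_MINOR =
        Pattern.compile("^(\\d+)\\.(\\d+)(\\..*)?$");

    // Returns {major, minor}; throws if the string does not match the assumed format.
    static int[] majorMinor(String version) {
        Matcher m = MAJOR_MINOR.matcher(version);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized version string: " + version);
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    // Convenience accessor for the major number only.
    static int major(String version) {
        return majorMinor(version)[0];
    }

    public static void main(String[] args) {
        int[] v = majorMinor("2.4.0");
        System.out.println(v[0] + "." + v[1]); // 2.4
    }
}
```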
- majorVersion(String) - Static method in class org.apache.spark.util.VersionUtils
-
Given a Spark version string, return the major version number.
- makeBinarySearch(Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.util.CollectionsUtils
-
- makeDescription(String, String, boolean) - Static method in class org.apache.spark.ui.UIUtils
-
Returns HTML rendering of a job or stage description.
- makeDriverRef(String, SparkConf, org.apache.spark.rpc.RpcEnv) - Static method in class org.apache.spark.util.RpcUtils
-
Retrieve an RpcEndpointRef which is located in the driver via its name.
- makeHref(boolean, String, String) - Static method in class org.apache.spark.ui.UIUtils
-
Return the correct Href after checking if master is running in the
reverse proxy mode or not.
- makeProgressBar(int, int, int, int, Map<String, Object>, int) - Static method in class org.apache.spark.ui.UIUtils
-
- makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD.
- makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD, with one or more
location preferences (hostnames of Spark nodes) for each object.
- makeRDDForPartitionedTable(Seq<Partition>) - Method in interface org.apache.spark.sql.hive.TableReader
-
- makeRDDForTable(Table) - Method in interface org.apache.spark.sql.hive.TableReader
-
- map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- map(Function1<Object, Object>) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Map the values of this matrix using a function.
- map(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Map the values of this matrix using a function.
- map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult
-
Transform this PartialResult into a PartialResult of type T.
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to all elements of this RDD.
- map(DataType, DataType) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type map.
- map(MapType) - Method in class org.apache.spark.sql.ColumnName
-
- map(Function1<T, U>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Scala-specific)
Returns a new Dataset that contains the result of applying func to each element.
- map(MapFunction<T, U>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Java-specific)
Returns a new Dataset that contains the result of applying func to each element.
- map(Column...) - Static method in class org.apache.spark.sql.functions
-
Creates a new map column.
- map(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Creates a new map column.
- map(Function<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream.
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by applying a function to all elements of this DStream.
- map_concat(Column...) - Static method in class org.apache.spark.sql.functions
-
Returns the union of all the given maps.
- map_concat(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Returns the union of all the given maps.
- map_from_arrays(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Creates a new map column.
- map_from_entries(Column) - Static method in class org.apache.spark.sql.functions
-
Returns a map created from the given array of entries.
- map_keys(Column) - Static method in class org.apache.spark.sql.functions
-
Returns an unordered array containing the keys of the map.
- map_values(Column) - Static method in class org.apache.spark.sql.functions
-
Returns an unordered array containing the values of the map.
- mapAsSerializableJavaMap(Map<A, B>) - Static method in class org.apache.spark.api.java.JavaUtils
-
- mapEdgePartitions(Function2<Object, EdgePartition<ED, VD>, EdgePartition<ED2, VD2>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- mapEdges(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute in the graph using the map function.
- mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it a whole partition at a
time.
- mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mapFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
Util JSON deserialization methods.
- MapFunction<T,U> - Interface in org.apache.spark.api.java.function
-
Base interface for a map function used in Dataset's map function.
- mapGroups(Function2<K, Iterator<V>, U>, Encoder<U>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
(Scala-specific)
Applies the given function to each group of data.
- mapGroups(MapGroupsFunction<K, V, U>, Encoder<U>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
(Java-specific)
Applies the given function to each group of data.
- MapGroupsFunction<K,V,R> - Interface in org.apache.spark.api.java.function
-
Base interface for a map function used in GroupedDataset's mapGroup function.
- mapGroupsWithState(Function3<K, Iterator<V>, GroupState<S>, U>, Encoder<S>, Encoder<U>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Experimental
(Scala-specific)
Applies the given function to each group of data, while maintaining a user-defined per-group
state.
- mapGroupsWithState(GroupStateTimeout, Function3<K, Iterator<V>, GroupState<S>, U>, Encoder<S>, Encoder<U>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Experimental
(Scala-specific)
Applies the given function to each group of data, while maintaining a user-defined per-group
state.
- mapGroupsWithState(MapGroupsWithStateFunction<K, V, S, U>, Encoder<S>, Encoder<U>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Experimental
(Java-specific)
Applies the given function to each group of data, while maintaining a user-defined per-group
state.
- mapGroupsWithState(MapGroupsWithStateFunction<K, V, S, U>, Encoder<S>, Encoder<U>, GroupStateTimeout) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
Experimental
(Java-specific)
Applies the given function to each group of data, while maintaining a user-defined per-group
state.
- MapGroupsWithStateFunction<K,V,S,R> - Interface in org.apache.spark.api.java.function
-
- mapId() - Method in class org.apache.spark.FetchFailed
-
- mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- mapId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- mapOutputTracker() - Method in class org.apache.spark.SparkEnv
-
- MapOutputTrackerMessage - Interface in org.apache.spark
-
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(Function1<Iterator<T>, Iterator<S>>, boolean, ClassTag<S>) - Method in class org.apache.spark.rdd.RDDBarrier
-
Experimental
Returns a new RDD by applying a function to each partition of the wrapped RDD,
where tasks are launched together in a barrier stage.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Scala-specific)
Returns a new Dataset that contains the result of applying func to each partition.
- mapPartitions(MapPartitionsFunction<T, U>, Encoder<U>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Java-specific)
Returns a new Dataset that contains the result of applying f to each partition.
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- MapPartitionsFunction<T,U> - Interface in org.apache.spark.api.java.function
-
Base interface for function used in Dataset's mapPartitions.
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.HadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
-
- MapStatus - Interface in org.apache.spark.scheduler
-
Result returned by a ShuffleMapTask to a scheduler.
- mapStatuses() - Method in class org.apache.spark.ShuffleStatus
-
MapStatus for each partition.
- mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- mapToJson(Map<String, String>) - Static method in class org.apache.spark.util.JsonProtocol
-
Util JSON serialization methods.
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream.
- mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it the adjacent vertex
attributes as well.
- mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it the adjacent vertex
attributes as well.
- mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute a partition at a time using the map function, passing it the
adjacent vertex attributes as well.
- mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- MapType - Class in org.apache.spark.sql.types
-
The data type for Maps.
- MapType(DataType, DataType, boolean) - Constructor for class org.apache.spark.sql.types.MapType
-
- MapType() - Constructor for class org.apache.spark.sql.types.MapType
-
No-arg constructor for kryo.
- mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Pass each value in the key-value pair RDD through a map function without changing the keys;
this also retains the original RDD's partitioning.
- mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.EdgeRDD
-
Map the values in an edge partitioning preserving the structure but changing the values.
- mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Maps each vertex attribute, preserving the index.
- mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Maps each vertex attribute, additionally supplying the vertex ID.
- mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Pass each value in the key-value pair RDD through a map function without changing the keys;
this also retains the original RDD's partitioning.
- mapValues(Function1<V, W>, Encoder<W>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
- mapValues(MapFunction<V, W>, Encoder<W>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
- mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying a map function to the value of each key-value pair in
'this' DStream without changing the key.
- mapValues(Function1<V, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying a map function to the value of each key-value pair in
'this' DStream without changing the key.
- mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each vertex attribute in the graph using the map function.
- mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mapWithState(StateSpec<K, V, StateType, MappedType>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Experimental
Return a JavaMapWithStateDStream by applying a function to every key-value element of
this stream, while maintaining some state data for each unique key.
- mapWithState(StateSpec<K, V, StateType, MappedType>, ClassTag<StateType>, ClassTag<MappedType>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Experimental
Return a MapWithStateDStream by applying a function to every key-value element of
this stream, while maintaining some state data for each unique key.
- MapWithStateDStream<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming.dstream
-
Experimental
DStream representing the stream of data generated by mapWithState operation on a pair DStream.
- MapWithStateDStream(StreamingContext, ClassTag<MappedType>) - Constructor for class org.apache.spark.streaming.dstream.MapWithStateDStream
-
- mark(int) - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- markSupported() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Restricts the graph to only the vertices and edges that are also in other, but keeps the
attributes from this graph.
- mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- master() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- master() - Method in class org.apache.spark.SparkContext
-
- master(String) - Method in class org.apache.spark.sql.SparkSession.Builder
-
Sets the Spark master URL to connect to, such as "local" to run locally, "local[4]" to
run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
- Matrices - Class in org.apache.spark.ml.linalg
-
- Matrices() - Constructor for class org.apache.spark.ml.linalg.Matrices
-
- Matrices - Class in org.apache.spark.mllib.linalg
-
- Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
-
- Matrix - Interface in org.apache.spark.ml.linalg
-
Trait for a local matrix.
- Matrix - Interface in org.apache.spark.mllib.linalg
-
Trait for a local matrix.
- MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed
-
Represents an entry in a distributed matrix.
- MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-
- MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation
-
Model representing the result of matrix factorization.
- MatrixFactorizationModel(int, RDD<Tuple2<Object, double[]>>, RDD<Tuple2<Object, double[]>>) - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- MatrixFactorizationModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.recommendation
-
- MatrixImplicits - Class in org.apache.spark.mllib.linalg
-
Implicit methods available in Scala for converting org.apache.spark.mllib.linalg.Matrix
to org.apache.spark.ml.linalg.Matrix and vice versa.
- MatrixImplicits() - Constructor for class org.apache.spark.mllib.linalg.MatrixImplicits
-
- MatrixType() - Static method in class org.apache.spark.ml.linalg.SQLDataTypes
-
- max() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Returns the maximum element from this RDD as defined by
the default comparator natural order.
- max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the maximum element from this RDD as defined by the specified
Comparator[T].
- MAX() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
-
- max() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- max() - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams
-
Upper bound after transformation, shared by all features.
Default: 1.0
- max(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
-
- max(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
-
- max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Maximum value of each dimension.
- max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Maximum value of each column.
- max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the max of this RDD as defined by the implicit Ordering[T].
- max(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the maximum value of the expression in a group.
- max(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the maximum value of the column in a group.
- max(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute the max value for each numeric column for each group.
- max(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute the max value for each numeric column for each group.
- max(Duration) - Method in class org.apache.spark.streaming.Duration
-
- max(Time) - Method in class org.apache.spark.streaming.Time
-
- max(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
- max() - Method in class org.apache.spark.util.StatCounter
-
- MAX_FEATURES_FOR_NORMAL_SOLVER() - Static method in class org.apache.spark.ml.regression.LinearRegression
-
When using LinearRegression.solver == "normal", the solver must limit the number of
features to at most this number.
- MAX_INT_DIGITS() - Static method in class org.apache.spark.sql.types.Decimal
-
Maximum number of decimal digits an Int can represent
- MAX_LONG_DIGITS() - Static method in class org.apache.spark.sql.types.Decimal
-
Maximum number of decimal digits a Long can represent
- MAX_PRECISION() - Static method in class org.apache.spark.sql.types.DecimalType
-
- MAX_RETAINED_DEAD_EXECUTORS() - Static method in class org.apache.spark.status.config
-
- MAX_RETAINED_JOBS() - Static method in class org.apache.spark.status.config
-
- MAX_RETAINED_ROOT_NODES() - Static method in class org.apache.spark.status.config
-
- MAX_RETAINED_STAGES() - Static method in class org.apache.spark.status.config
-
- MAX_RETAINED_TASKS_PER_STAGE() - Static method in class org.apache.spark.status.config
-
- MAX_SCALE() - Static method in class org.apache.spark.sql.types.DecimalType
-
- maxAbs() - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
-
- MaxAbsScaler - Class in org.apache.spark.ml.feature
-
Rescale each feature individually to range [-1, 1] by dividing through the largest maximum
absolute value in each feature.
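The rescaling described above can be sketched on a plain array of rows. This is an illustrative plain-Java sketch, not the MaxAbsScaler implementation: the class name `MaxAbsScaleSketch` is hypothetical, rows are assumed to have equal length, and a feature whose largest absolute value is zero is assumed to be left unchanged to avoid dividing by zero.

```java
// Hypothetical helper: divides every column by that column's largest absolute value.
final class MaxAbsScaleSketch {
    static double[][] scale(double[][] rows) {
        if (rows.length == 0) {
            return new double[0][];
        }
        int numFeatures = rows[0].length;

        // Pass 1: largest absolute value seen in each feature (column).
        double[] maxAbs = new double[numFeatures];
        for (double[] row : rows) {
            for (int j = 0; j < numFeatures; j++) {
                maxAbs[j] = Math.max(maxAbs[j], Math.abs(row[j]));
            }
        }

        // Pass 2: divide each value by its column's maximum, mapping it into [-1, 1].
        double[][] scaled = new double[rows.length][numFeatures];
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < numFeatures; j++) {
                // Assumption: an all-zero column is divided by 1 and so stays zero.
                double divisor = maxAbs[j] == 0.0 ? 1.0 : maxAbs[j];
                scaled[i][j] = rows[i][j] / divisor;
            }
        }
        return scaled;
    }

    public static void main(String[] args) {
        double[][] out = scale(new double[][] { {1, -8, 0}, {2, 4, 0}, {-4, 2, 0} });
        System.out.println(java.util.Arrays.deepToString(out));
    }
}
```

Because the values are only divided and never shifted, zeros stay zero, which is why this kind of scaling does not destroy sparsity.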
- MaxAbsScaler(String) - Constructor for class org.apache.spark.ml.feature.MaxAbsScaler
-
- MaxAbsScaler() - Constructor for class org.apache.spark.ml.feature.MaxAbsScaler
-
- MaxAbsScalerModel - Class in org.apache.spark.ml.feature
-
- MaxAbsScalerParams - Interface in org.apache.spark.ml.feature
-
- maxBins() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
Maximum number of bins used for discretizing continuous features and for choosing how to split
on features at each node.
- maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxBufferSizeMb() - Method in class org.apache.spark.serializer.KryoSerializer
-
- maxCategories() - Method in interface org.apache.spark.ml.feature.VectorIndexerParams
-
Threshold for the number of values a categorical feature can take.
- maxCores() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- maxDepth() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
Maximum depth of the tree (nonnegative).
- maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxDF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
-
Specifies the maximum number of different documents a term could appear in to be included
in the vocabulary.
- maxId() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- maxId() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- maxId() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- maxId() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- maxId() - Static method in class org.apache.spark.rdd.CheckpointState
-
- maxId() - Static method in class org.apache.spark.rdd.DeterministicLevel
-
- maxId() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- maxId() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- maxId() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
-
- maxId() - Static method in class org.apache.spark.TaskState
-
- maxIter() - Method in interface org.apache.spark.ml.param.shared.HasMaxIter
-
Param for maximum number of iterations (>= 0).
- maxIters() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- maxLocalProjDBSize() - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
Param for the maximum number of items (including delimiters used in the internal storage
format) allowed in a projected database before local processing (default: 32000000).
- maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- maxMemory() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- maxMemory() - Method in class org.apache.spark.status.LiveExecutor
-
- maxMemoryInMB() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
Maximum memory in MB allocated to histogram aggregation.
- maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxMessageSizeBytes(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-
Returns the configured max message size for messages in bytes.
- maxNodesInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the maximum number of nodes which can be in the given level of the tree.
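For intuition, the bound on nodes per level can be written as a one-line computation. The sketch below assumes a binary tree whose root sits at level 0, so level k holds at most 2^k nodes; the class name `TreeLevelSketch` and the range check are additions for this example only.

```java
// Hypothetical helper illustrating the per-level node bound of a binary tree.
final class TreeLevelSketch {
    // Assumes a binary tree with the root at level 0: level k has at most 2^k nodes.
    static int maxNodesInLevel(int level) {
        // Range check added for this sketch so the shift cannot overflow an int.
        if (level < 0 || level > 30) {
            throw new IllegalArgumentException("level must be in [0, 30]");
        }
        return 1 << level;
    }

    public static void main(String[] args) {
        System.out.println(maxNodesInLevel(0)); // 1
        System.out.println(maxNodesInLevel(3)); // 8
    }
}
```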
- maxNumConcurrentTasks() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
Get the max number of tasks that can be concurrently launched currently.
- maxOffHeapMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- maxOffHeapMemSize() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
-
- maxOnHeapMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- maxOnHeapMemSize() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
-
- maxPatternLength() - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
Param for the maximal pattern length (default: 10).
- maxPrecisionForBytes(int) - Static method in class org.apache.spark.sql.types.Decimal
-
- maxReplicas() - Method in class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock
-
- maxSentenceLength() - Method in interface org.apache.spark.ml.feature.Word2VecBase
-
Sets the maximum length (in words) of each sentence in the input data.
- maxSplitFeatureIndex() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel
-
Trace down the tree, and return the largest feature index used in any split.
- maxTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- maxTasks() - Method in class org.apache.spark.status.LiveExecutor
-
- maxVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- maybeUpdateOutputMetrics(OutputMetrics, Function0<Object>, long) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
-
- md5(Column) - Static method in class org.apache.spark.sql.functions
-
Calculates the MD5 digest of a binary column and returns the value
as a 32 character hex string.
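To show what a 32 character hex MD5 string looks like, here is a small plain-Java sketch that uses the JDK's `MessageDigest`. It hashes a byte array rather than a `Column`, and the class name `Md5HexSketch` is hypothetical; it is meant only to illustrate the digest-to-hex step, not how Spark evaluates the function.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: MD5 digest of a byte array, rendered as lowercase hex.
final class Md5HexSketch {
    static String md5Hex(byte[] input) {
        try {
            // MD5 always yields 16 bytes, which become 32 hex characters.
            byte[] digest = MessageDigest.getInstance("MD5").digest(input);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // prints 900150983cd24fb0d6963f7d28e17f72
        System.out.println(md5Hex("abc".getBytes(StandardCharsets.UTF_8)));
    }
}
```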
- mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the mean of this RDD's elements.
- mean() - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- mean() - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian
-
- mean(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
-
- mean(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
-
- mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- mean() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- mean() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Sample mean of each dimension.
- mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample mean vector.
- mean() - Method in class org.apache.spark.partial.BoundedDouble
-
- mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the mean of this RDD's elements.
- mean(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- mean(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- mean(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute the average value for each numeric column for each group.
- mean(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute the average value for each numeric column for each group.
- mean() - Method in class org.apache.spark.util.StatCounter
-
- meanAbsoluteError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Returns the mean absolute error, which is a risk function corresponding to the
expected value of the absolute error loss or l1-norm loss.
- meanAbsoluteError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the mean absolute error, which is a risk function corresponding to the
expected value of the absolute error loss or l1-norm loss.
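As a concrete reading of the description above, the mean absolute error over a finite sample is the average of |label - prediction|. The following plain-Java sketch computes it on two arrays; the class name `MeanAbsoluteErrorSketch` and the array-based inputs are assumptions for illustration and are not how the Spark classes take their data.

```java
// Hypothetical helper: sample mean of the absolute differences between two arrays.
final class MeanAbsoluteErrorSketch {
    static double meanAbsoluteError(double[] labels, double[] predictions) {
        if (labels.length != predictions.length || labels.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < labels.length; i++) {
            // Absolute error (l1 loss) for one observation.
            sum += Math.abs(labels[i] - predictions[i]);
        }
        return sum / labels.length;
    }

    public static void main(String[] args) {
        double[] labels = { 3.0, -0.5, 2.0, 7.0 };
        double[] predictions = { 2.5, 0.0, 2.0, 8.0 };
        System.out.println(meanAbsoluteError(labels, predictions)); // 0.5
    }
}
```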
- meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the approximate mean of the elements in this RDD.
- meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Approximate operation to return the mean within a timeout.
- meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Approximate operation to return the mean within a timeout.
- meanAveragePrecision() - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
-
Returns the mean average precision (MAP) of all the queries.
- means() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
-
- means() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- meanSquaredError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Returns the mean squared error, which is a risk function corresponding to the
expected value of the squared error loss or quadratic loss.
- meanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the mean squared error, which is a risk function corresponding to the
expected value of the squared error loss or quadratic loss.
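The mean absolute error and mean squared error entries above both describe an average loss over prediction/label pairs. A minimal sketch of the two formulas in plain Java follows; the class `RegressionErrorSketch` and its method signatures are hypothetical names chosen for illustration and are not part of Spark's API.

```java
// Hypothetical helper, not Spark code: averages the absolute (L1) and
// squared (quadratic) error over paired predictions and labels.
final class RegressionErrorSketch {
    private RegressionErrorSketch() {}

    /** Mean of |prediction - label| over all pairs. */
    static double meanAbsoluteError(double[] predictions, double[] labels) {
        checkInputs(predictions, labels);
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            sum += Math.abs(predictions[i] - labels[i]);
        }
        return sum / predictions.length;
    }

    /** Mean of (prediction - label)^2 over all pairs. */
    static double meanSquaredError(double[] predictions, double[] labels) {
        checkInputs(predictions, labels);
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            double diff = predictions[i] - labels[i];
            sum += diff * diff;
        }
        return sum / predictions.length;
    }

    // Both arrays must be non-empty and have the same length.
    private static void checkInputs(double[] predictions, double[] labels) {
        if (predictions.length == 0 || predictions.length != labels.length) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
    }
}
```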
- megabytesToString(long) - Static method in class org.apache.spark.util.Utils
-
Convert a quantity in megabytes to a human-readable string such as "4.0 MB".
- MEM_SPILL() - Static method in class org.apache.spark.status.TaskIndexNames
-
- MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_BYTES_SPILLED() - Static method in class org.apache.spark.InternalAccumulator
-
- MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
-
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- memoryCost(int, int) - Static method in class org.apache.spark.mllib.feature.PCAUtil
-
- MemoryEntry<T> - Interface in org.apache.spark.storage.memory
-
- MemoryEntryBuilder<T> - Interface in org.apache.spark.storage.memory
-
- memoryManager() - Method in class org.apache.spark.SparkEnv
-
- memoryMetrics() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- MemoryMetrics - Class in org.apache.spark.status.api.v1
-
- memoryMode() - Method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
-
- memoryMode() - Method in interface org.apache.spark.storage.memory.MemoryEntry
-
- memoryMode() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
-
- MemoryParam - Class in org.apache.spark.util
-
An extractor object for parsing JVM memory strings, such as "10g", into an Int representing
the number of megabytes.
- MemoryParam() - Constructor for class org.apache.spark.util.MemoryParam
-
- memoryPerExecutorMB() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- memoryRemaining() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-
- memoryStringToMb(String) - Static method in class org.apache.spark.util.Utils
-
Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) to a number of mebibytes.
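As a rough illustration of the conversion described above, the sketch below parses strings such as "300m" or "1g" into whole mebibytes. The class `MemoryStringSketch`, the method `toMebibytes`, and the decision to treat a bare number as a byte count are assumptions made for this example; it is not Spark's implementation.

```java
import java.util.Locale;

// Hypothetical parser, not Spark code: converts -Xmx style strings to MiB.
final class MemoryStringSketch {
    private MemoryStringSketch() {}

    /** Parses strings such as "300m" or "1g" into mebibytes, rounding down. */
    static long toMebibytes(String text) {
        String s = text.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty memory string");
        }
        char suffix = s.charAt(s.length() - 1);
        long bytesPerUnit;
        switch (suffix) {
            case 'k': bytesPerUnit = 1L << 10; break;
            case 'm': bytesPerUnit = 1L << 20; break;
            case 'g': bytesPerUnit = 1L << 30; break;
            case 't': bytesPerUnit = 1L << 40; break;
            default:
                if (!Character.isDigit(suffix)) {
                    throw new IllegalArgumentException("unknown unit in: " + text);
                }
                bytesPerUnit = 1L; // assumption: a bare number is a byte count
        }
        String digits = bytesPerUnit == 1L ? s : s.substring(0, s.length() - 1);
        long value = Long.parseLong(digits); // rejects an empty or non-numeric prefix
        if (value < 0) {
            throw new IllegalArgumentException("negative size: " + text);
        }
        // multiplyExact guards against silent overflow for very large inputs.
        return Math.multiplyExact(value, bytesPerUnit) / (1L << 20);
    }
}
```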
- memoryUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- memoryUsed() - Method in class org.apache.spark.status.LiveExecutor
-
- memoryUsed() - Method in class org.apache.spark.status.LiveRDD
-
- memoryUsed() - Method in class org.apache.spark.status.LiveRDDDistribution
-
- memoryUsed() - Method in class org.apache.spark.status.LiveRDDPartition
-
- memoryUsedBytes() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
-
- memSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- memSize() - Method in class org.apache.spark.storage.BlockStatus
-
- memSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
-
- memSize() - Method in class org.apache.spark.storage.RDDInfo
-
- merge(R) - Method in class org.apache.spark.Accumulable
-
Deprecated.
Merge two accumulable objects together
- merge(ExpectationAggregator) - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
-
Merge another ExpectationAggregator, update the weights, means and covariances
for each distribution, and update the log likelihood.
- merge(Agg) - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
-
Merge two aggregators.
- merge(AFTAggregator) - Method in class org.apache.spark.ml.regression.AFTAggregator
-
Merge another AFTAggregator, and update the loss and gradient
of the objective function.
- merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Merges another.
- merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Merge another MultivariateOnlineSummarizer, and update the statistical summary.
- merge(int, U) - Method in interface org.apache.spark.partial.ApproximateEvaluator
-
- merge(BUF, BUF) - Method in class org.apache.spark.sql.expressions.Aggregator
-
Merge two intermediate values.
- merge(MutableAggregationBuffer, Row) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
-
Merges two aggregation buffers and stores the updated buffer values back to buffer1.
- merge(AccumulatorV2<IN, OUT>) - Method in class org.apache.spark.util.AccumulatorV2
-
Merges another same-type accumulator into this one and updates its state, i.e.
- merge(AccumulatorV2<T, List<T>>) - Method in class org.apache.spark.util.CollectionAccumulator
-
- merge(AccumulatorV2<Double, Double>) - Method in class org.apache.spark.util.DoubleAccumulator
-
- merge(AccumulatorV2<T, R>) - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
-
- merge(AccumulatorV2<Long, Long>) - Method in class org.apache.spark.util.LongAccumulator
-
- merge(double) - Method in class org.apache.spark.util.StatCounter
-
Add a value into this StatCounter, updating the internal statistics.
- merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter
-
Add multiple values into this StatCounter, updating the internal statistics.
- merge(StatCounter) - Method in class org.apache.spark.util.StatCounter
-
Merge another StatCounter into this one, adding up the internal statistics.
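The three StatCounter merge entries above describe adding single values and combining two sets of running statistics. One common way to do this without revisiting the data is to track a count, a mean, and a sum of squared deviations, then combine them with the standard pairwise update. The sketch below assumes that approach; `RunningStatsSketch` and its members are hypothetical names, and the code is not taken from Spark.

```java
// Hypothetical running-statistics accumulator, not Spark code.
final class RunningStatsSketch {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // sum of squared deviations from the mean

    /** Adds one value, updating the count, mean, and squared deviations. */
    RunningStatsSketch add(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
        return this;
    }

    /** Folds another accumulator's statistics into this one. */
    RunningStatsSketch merge(RunningStatsSketch other) {
        if (other.count == 0) {
            return this;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return this;
        }
        long total = count + other.count;
        double delta = other.mean - mean;
        mean += delta * other.count / total;
        m2 += other.m2 + delta * delta * count * other.count / total;
        count = total;
        return this;
    }

    long count() { return count; }

    double mean() { return mean; }

    /** Population variance; NaN when no values have been added. */
    double variance() { return count == 0 ? Double.NaN : m2 / count; }
}
```

Because the merge step needs only the three summary fields, partial results computed on separate partitions can be combined in any order.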
- mergeCombiners() - Method in class org.apache.spark.Aggregator
-
- mergeInPlace(BloomFilter) - Method in class org.apache.spark.util.sketch.BloomFilter
-
Combines this bloom filter with another bloom filter by performing a bitwise OR of the
underlying data.
- mergeInPlace(CountMinSketch) - Method in class org.apache.spark.util.sketch.CountMinSketch
-
- mergeOffsets(PartitionOffset[]) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader
-
- mergeValue() - Method in class org.apache.spark.Aggregator
-
- message() - Method in class org.apache.spark.FetchFailed
-
- message() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
-
- message() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker
-
- message() - Static method in class org.apache.spark.scheduler.ExecutorKilled
-
- message() - Static method in class org.apache.spark.scheduler.LossReasonPending
-
- message() - Method in exception org.apache.spark.sql.AnalysisException
-
- message() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
-
- message() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
-
- MetaAlgorithmReadWrite - Class in org.apache.spark.ml.util
-
Default Meta-Algorithm read and write implementation.
- MetaAlgorithmReadWrite() - Constructor for class org.apache.spark.ml.util.MetaAlgorithmReadWrite
-
- Metadata - Class in org.apache.spark.sql.types
-
Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean,
Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and
Array[Metadata].
- metadata() - Method in class org.apache.spark.sql.types.StructField
-
- metadata() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
-
- METADATA_KEY_DESCRIPTION() - Static method in class org.apache.spark.streaming.scheduler.StreamInputInfo
-
The key for description in StreamInputInfo.metadata.
- MetadataBuilder - Class in org.apache.spark.sql.types
-
- MetadataBuilder() - Constructor for class org.apache.spark.sql.types.MetadataBuilder
-
- metadataDescription() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
-
- MetadataUtils - Class in org.apache.spark.ml.util
-
Helper utilities for algorithms using ML metadata
- MetadataUtils() - Constructor for class org.apache.spark.ml.util.MetadataUtils
-
- Method(String, Function2<Object, Object, Object>) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method
-
- method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- Method$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
-
- MethodIdentifier<T> - Class in org.apache.spark.util
-
Helper class to identify a method.
- MethodIdentifier(Class<T>, String, String) - Constructor for class org.apache.spark.util.MethodIdentifier
-
- methodName() - Method in interface org.apache.spark.mllib.stat.test.StreamingTestMethod
-
- methodName() - Static method in class org.apache.spark.mllib.stat.test.StudentTTest
-
- methodName() - Static method in class org.apache.spark.mllib.stat.test.WelchTTest
-
- METRIC_COMPILATION_TIME() - Static method in class org.apache.spark.metrics.source.CodegenMetrics
-
Histogram of the time it took to compile source code text (in milliseconds).
- METRIC_FILE_CACHE_HITS() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
Tracks the total number of files served from the file status cache instead of discovered.
- METRIC_FILES_DISCOVERED() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
Tracks the total number of files discovered off of the filesystem by InMemoryFileIndex.
- METRIC_GENERATED_CLASS_BYTECODE_SIZE() - Static method in class org.apache.spark.metrics.source.CodegenMetrics
-
Histogram of the bytecode size of each class generated by CodeGenerator.
- METRIC_GENERATED_METHOD_BYTECODE_SIZE() - Static method in class org.apache.spark.metrics.source.CodegenMetrics
-
Histogram of the bytecode size of each method in classes generated by CodeGenerator.
- METRIC_HIVE_CLIENT_CALLS() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
Tracks the total number of Hive client calls (e.g.
- METRIC_PARALLEL_LISTING_JOB_COUNT() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
Tracks the total number of Spark jobs launched for parallel file listing.
- METRIC_PARTITIONS_FETCHED() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
Tracks the total number of partition metadata entries fetched via the client api.
- METRIC_SOURCE_CODE_SIZE() - Static method in class org.apache.spark.metrics.source.CodegenMetrics
-
Histogram of the length of source code text compiled by CodeGenerator (in characters).
- metricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
param for metric name in evaluation (supports "areaUnderROC" (default), "areaUnderPR")
- metricName() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
param for metric name in evaluation (supports "silhouette" (default))
- metricName() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
param for metric name in evaluation (supports "f1" (default), "weightedPrecision", "weightedRecall", "accuracy")
- metricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
Param for metric name in evaluation.
- metricRegistry() - Static method in class org.apache.spark.metrics.source.CodegenMetrics
-
- metricRegistry() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
- metricRegistry() - Method in interface org.apache.spark.metrics.source.Source
-
- metrics(String...) - Static method in class org.apache.spark.ml.stat.Summarizer
-
Given a list of metrics, provides a builder that in turn computes metrics from a column.
- metrics(Seq<String>) - Static method in class org.apache.spark.ml.stat.Summarizer
-
Given a list of metrics, provides a builder that in turn computes metrics from a column.
- metrics() - Method in class org.apache.spark.status.LiveExecutorStageSummary
-
- metrics() - Method in class org.apache.spark.status.LiveStage
-
- METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
-
- metricsSystem() - Method in class org.apache.spark.SparkEnv
-
- MFDataGenerator - Class in org.apache.spark.mllib.util
-
Developer API
Generate RDD(s) containing data for Matrix Factorization.
- MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
-
- MicroBatchReader - Interface in org.apache.spark.sql.sources.v2.reader.streaming
-
- MicroBatchReadSupport - Interface in org.apache.spark.sql.sources.v2
-
- microF1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns the micro-averaged label-based f1-measure (equal to the micro-averaged document-based f1-measure).
- microPrecision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns the micro-averaged label-based precision (equal to the micro-averaged document-based precision).
- microRecall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns the micro-averaged label-based recall (equal to the micro-averaged document-based recall).
- mightContain(Object) - Method in class org.apache.spark.util.sketch.BloomFilter
-
Returns true if the element might have been put in this Bloom filter, false if this is definitely not the case.
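The one-sided guarantee described above (a false answer is definite, a true answer is only probable) follows from how a Bloom filter stores items as bits. The toy filter below is a sketch under stated assumptions: `BloomFilterSketch`, its double-hashing scheme, and the FNV-1a helper are illustrative choices, not the implementation in org.apache.spark.util.sketch. It also shows how merging two compatible filters reduces to a bitwise OR of their bit sets.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical toy Bloom filter, not Spark code.
final class BloomFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomFilterSketch(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Sets every bit position derived from the item. */
    void put(String item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    /** False means definitely absent; true means possibly present. */
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    /** Combines another filter of identical shape via bitwise OR. */
    void mergeInPlace(BloomFilterSketch other) {
        if (other.numBits != numBits || other.numHashes != numHashes) {
            throw new IllegalArgumentException("filters must have the same shape");
        }
        bits.or(other.bits);
    }

    // Double hashing: derive the i-th bit index from two base hashes.
    private int index(String item, int i) {
        byte[] data = item.getBytes(StandardCharsets.UTF_8);
        int h1 = Arrays.hashCode(data);
        int h2 = fnv1a(data);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // 32-bit FNV-1a hash over the item's bytes.
    private static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;
        }
        return hash;
    }
}
```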
- mightContainBinary(byte[]) - Method in class org.apache.spark.util.sketch.BloomFilter
-
- mightContainLong(long) - Method in class org.apache.spark.util.sketch.BloomFilter
-
- mightContainString(String) - Method in class org.apache.spark.util.sketch.BloomFilter
-
- milliseconds() - Method in class org.apache.spark.streaming.Duration
-
- milliseconds(long) - Static method in class org.apache.spark.streaming.Durations
-
- Milliseconds - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing a given number of milliseconds.
- Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
-
- milliseconds() - Method in class org.apache.spark.streaming.Time
-
- millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
Reformat a time interval in milliseconds to a prettier format for output
- min() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Returns the minimum element from this RDD as defined by
the default comparator natural order.
- min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the minimum element from this RDD as defined by the specified
Comparator[T].
- MIN() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
-
- min() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- min() - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams
-
Lower bound after transformation, shared by all features. Default: 0.0
- min(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
-
- min(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
-
- min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Minimum value of each dimension.
- min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Minimum value of each column.
- min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the min of this RDD as defined by the implicit Ordering[T].
- min(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the minimum value of the expression in a group.
- min(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the minimum value of the column in a group.
- min(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute the min value for each numeric column for each group.
- min(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute the min value for each numeric column for each group.
- min(Duration) - Method in class org.apache.spark.streaming.Duration
-
- min(Time) - Method in class org.apache.spark.streaming.Time
-
- min() - Method in class org.apache.spark.util.StatCounter
-
- minBytesForPrecision() - Static method in class org.apache.spark.sql.types.Decimal
-
- minConfidence() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
-
Minimal confidence for generating Association Rule.
- minCount() - Method in interface org.apache.spark.ml.feature.Word2VecBase
-
The minimum number of times a token must appear to be included in the word2vec model's
vocabulary.
- minDF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
-
Specifies the minimum number of different documents a term must appear in to be included
in the vocabulary.
- minDivisibleClusterSize() - Method in interface org.apache.spark.ml.clustering.BisectingKMeansParams
-
The minimum number of points (if greater than or equal to 1.0) or the minimum proportion
of points (if less than 1.0) of a divisible cluster (default: 1.0).
- minDocFreq() - Method in interface org.apache.spark.ml.feature.IDFBase
-
The minimum number of documents in which a term should appear.
- minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
- minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF
-
- MinHashLSH - Class in org.apache.spark.ml.feature
-
Experimental
- MinHashLSH(String) - Constructor for class org.apache.spark.ml.feature.MinHashLSH
-
- MinHashLSH() - Constructor for class org.apache.spark.ml.feature.MinHashLSH
-
- MinHashLSHModel - Class in org.apache.spark.ml.feature
-
Experimental
- MINIMUM_ADJUSTED_SCALE() - Static method in class org.apache.spark.sql.types.DecimalType
-
- minInfoGain() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
Minimum information gain for a split to be considered at a tree node.
- minInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- minInstancesPerNode() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
Minimum number of instances each child must have after split.
- minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- MinMaxScaler - Class in org.apache.spark.ml.feature
-
Rescale each feature individually to a common range [min, max] linearly using column summary
statistics, which is also known as min-max normalization or Rescaling.
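The linear rescaling described above maps a feature's observed minimum and maximum onto the target range [min, max]. A minimal plain-Java sketch for a single feature column follows. `MinMaxRescaleSketch` is a hypothetical name, and returning the midpoint of the target range for a constant column is an assumption of this example rather than a statement about Spark's code.

```java
// Hypothetical helper, not Spark code: rescales one feature column to [min, max].
final class MinMaxRescaleSketch {
    private MinMaxRescaleSketch() {}

    static double[] rescale(double[] values, double min, double max) {
        if (values.length == 0) {
            return new double[0];
        }
        // Column summary statistics: observed minimum and maximum.
        double eMin = values[0];
        double eMax = values[0];
        for (double v : values) {
            eMin = Math.min(eMin, v);
            eMax = Math.max(eMax, v);
        }
        double range = eMax - eMin;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = range == 0.0
                ? 0.5 * (max + min) // assumption: constant column maps to the midpoint
                : (values[i] - eMin) / range * (max - min) + min;
        }
        return out;
    }
}
```

For example, rescaling the column {1.0, 2.0, 3.0} to the range [0.0, 1.0] yields {0.0, 0.5, 1.0}.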
- MinMaxScaler(String) - Constructor for class org.apache.spark.ml.feature.MinMaxScaler
-
- MinMaxScaler() - Constructor for class org.apache.spark.ml.feature.MinMaxScaler
-
- MinMaxScalerModel - Class in org.apache.spark.ml.feature
-
- MinMaxScalerParams - Interface in org.apache.spark.ml.feature
-
- minorVersion(String) - Static method in class org.apache.spark.util.VersionUtils
-
Given a Spark version string, return the minor version number.
- minSamplingRate() - Static method in class org.apache.spark.util.random.BinomialBounds
-
- minShare() - Method in interface org.apache.spark.scheduler.Schedulable
-
- minSupport() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
-
Minimal support level of the frequent pattern.
- minSupport() - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
Param for the minimal support level (default: 0.1).
- minTF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
-
Filter to ignore rare words in a document.
- minTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
Minimum token length, greater than or equal to 0.
- minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD
-
For each VertexId present in both this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this.
- minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
-
For each VertexId present in both this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this.
- minus(Object) - Method in class org.apache.spark.sql.Column
-
Subtraction.
- minus(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- minus(Duration) - Method in class org.apache.spark.streaming.Duration
-
- minus(Time) - Method in class org.apache.spark.streaming.Time
-
- minus(Duration) - Method in class org.apache.spark.streaming.Time
-
- minute(Column) - Static method in class org.apache.spark.sql.functions
-
Extracts the minutes as an integer from a given date/timestamp/string.
- minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- minutes(long) - Static method in class org.apache.spark.streaming.Durations
-
- Minutes - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing a given number of minutes.
- Minutes() - Constructor for class org.apache.spark.streaming.Minutes
-
- minVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- missingValue() - Method in interface org.apache.spark.ml.feature.ImputerParams
-
The placeholder for the missing values.
- mkList() - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- mkString() - Method in interface org.apache.spark.sql.Row
-
Displays all elements of this sequence in a string (without a separator).
- mkString(String) - Method in interface org.apache.spark.sql.Row
-
Displays all elements of this sequence in a string using a separator string.
- mkString(String, String, String) - Method in interface org.apache.spark.sql.Row
-
Displays all elements of this traversable or iterator in a string using
start, end, and separator strings.
- mkString(String, String, String) - Method in class org.apache.spark.status.api.v1.StackTrace
-
- ML_ATTR() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
-
- mlDenseMatrixToMLlibDenseMatrix(DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
-
- mlDenseVectorToMLlibDenseVector(DenseVector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
-
- MLFormatRegister - Interface in org.apache.spark.ml.util
-
ML export formats should implement this trait so that users can specify a short name rather
than the fully qualified class name of the exporter.
- mllibDenseMatrixToMLDenseMatrix(DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
-
- mllibDenseVectorToMLDenseVector(DenseVector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
-
- mllibMatrixToMLMatrix(Matrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
-
- mllibSparseMatrixToMLSparseMatrix(SparseMatrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
-
- mllibSparseVectorToMLSparseVector(SparseVector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
-
- mllibVectorToMLVector(Vector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
-
- mlMatrixToMLlibMatrix(Matrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
-
- MLPairRDDFunctions<K,V> - Class in org.apache.spark.mllib.rdd
-
Developer API
Machine learning specific Pair RDD functions.
- MLPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.mllib.rdd.MLPairRDDFunctions
-
- MLReadable<T> - Interface in org.apache.spark.ml.util
-
Trait for objects that provide MLReader.
- MLReader<T> - Class in org.apache.spark.ml.util
-
Abstract class for utility classes that can load ML instances.
- MLReader() - Constructor for class org.apache.spark.ml.util.MLReader
-
- mlSparseMatrixToMLlibSparseMatrix(SparseMatrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
-
- mlSparseVectorToMLlibSparseVector(SparseVector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
-
- MLUtils - Class in org.apache.spark.mllib.util
-
Helper methods to load, save and pre-process data used in MLlib.
- MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
-
- mlVectorToMLlibVector(Vector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
-
- MLWritable - Interface in org.apache.spark.ml.util
-
Trait for classes that provide MLWriter.
- MLWriter - Class in org.apache.spark.ml.util
-
Abstract class for utility classes that can save ML instances in Spark's internal format.
- MLWriter() - Constructor for class org.apache.spark.ml.util.MLWriter
-
- MLWriterFormat - Interface in org.apache.spark.ml.util
-
Abstract class to be implemented by objects that provide ML exportability.
- mod(Object) - Method in class org.apache.spark.sql.Column
-
Modulo (a.k.a.
- mode(SaveMode) - Method in class org.apache.spark.sql.DataFrameWriter
-
Specifies the behavior when data or table already exists.
- mode(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Specifies the behavior when data or table already exists.
- mode() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
-
- model(Vector) - Method in interface org.apache.spark.ml.ann.Topology
-
- model(long) - Method in interface org.apache.spark.ml.ann.Topology
-
- Model<M extends Model<M>> - Class in org.apache.spark.ml
-
- Model() - Constructor for class org.apache.spark.ml.Model
-
- models() - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- modelType() - Method in interface org.apache.spark.ml.classification.NaiveBayesParams
-
The model type which is a string (case-sensitive).
- modelType() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- modelType() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
-
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
Deprecated.
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
Deprecated.
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
Deprecated.
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
Deprecated.
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.StringAccumulatorParam$
-
Deprecated.
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.internal.io.FileCommitProtocol.EmptyTaskCommitMessage$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.InternalAccumulator.input$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.InternalAccumulator.output$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.InternalAccumulator.shuffleRead$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.InternalAccumulator.shuffleWrite$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.feature.Word2VecModel.Word2VecModelWriter$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.Pipeline.SharedReadWrite$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.InBlock$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.Rating$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.RatingBlock$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Family$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.FamilyAndLink$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Link$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Tweedie$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV2_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV2_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.fpm.PrefixSpan.Postfix$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.fpm.PrefixSpan.Prefix$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest.NullHypothesis$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.GetExecutorLossReason$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutorsOnHost$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveLastAllocatedExecutorId$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkAppConfig$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SetupDriver$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.Shutdown$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveShim.HiveFunctionWrapper$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveStrategies.Scripts$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.RelationalGroupedDataset.CubeType$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.RelationalGroupedDataset.GroupByType$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.RelationalGroupedDataset.PivotType$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.RelationalGroupedDataset.RollupType$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.types.Decimal.DecimalAsIfIntegral$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.types.Decimal.DecimalIsFractional$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.types.DecimalType.Expression$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.types.DecimalType.Fixed$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.BlockLocationsAndStatus$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetExecutorEndpointRef$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocations$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocationsAndStatus$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetPeers$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.HasCachedBlocks$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.TriggerThreadDump$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ui.JettyUtils.ServletParams$
-
Static reference to the singleton instance of this Scala object.
- monotonically_increasing_id() - Static method in class org.apache.spark.sql.functions
-
A column expression that generates monotonically increasing 64-bit integers.
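The generated IDs are increasing and unique but not consecutive. Spark's documentation describes the current implementation as placing the partition ID in the upper 31 bits and the record number within each partition in the lower 33 bits. The plain-Java sketch below only illustrates that bit layout; `MonotonicIdSketch` and `idFor` are hypothetical names, not part of the Spark API.

```java
// Illustrative sketch of the documented 64-bit layout:
// upper 31 bits = partition ID, lower 33 bits = record number in the partition.
final class MonotonicIdSketch {
    static final int RECORD_BITS = 33;

    // Hypothetical helper; Spark evaluates this inside a Column expression.
    static long idFor(int partitionId, long recordNumber) {
        return ((long) partitionId << RECORD_BITS) + recordNumber;
    }

    public static void main(String[] args) {
        // Two partitions with three records each.
        for (int partition = 0; partition < 2; partition++) {
            for (long record = 0; record < 3; record++) {
                System.out.println(idFor(partition, record));
            }
        }
    }
}
```

The jump between the last ID of one partition and the first ID of the next is why the values should be treated as unique keys rather than row numbers.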
- monotonicallyIncreasingId() - Static method in class org.apache.spark.sql.functions
-
- month(Column) - Static method in class org.apache.spark.sql.functions
-
Extracts the month as an integer from a given date/timestamp/string.
- months_between(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns number of months between dates start and end.
- months_between(Column, Column, boolean) - Static method in class org.apache.spark.sql.functions
-
Returns number of months between dates end and start.
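As a rough illustration of the month arithmetic, the sketch below works on calendar dates only. It assumes the commonly documented convention: a whole number when both dates fall on the same day of the month or both are the last day of their months, and otherwise a fraction based on 31 days per month. Time-of-day handling and rounding are deliberately left out, and `MonthsBetweenSketch` is a hypothetical name, not Spark code.

```java
import java.time.LocalDate;

// Simplified, date-only sketch of a months-between calculation.
final class MonthsBetweenSketch {
    static double monthsBetween(LocalDate end, LocalDate start) {
        int monthDiff = (end.getYear() - start.getYear()) * 12
                + (end.getMonthValue() - start.getMonthValue());
        boolean sameDay = end.getDayOfMonth() == start.getDayOfMonth();
        boolean bothLastDay = end.getDayOfMonth() == end.lengthOfMonth()
                && start.getDayOfMonth() == start.lengthOfMonth();
        if (sameDay || bothLastDay) {
            return monthDiff; // whole number of months
        }
        // Assumption: the fractional part is computed against a 31-day month.
        return monthDiff + (end.getDayOfMonth() - start.getDayOfMonth()) / 31.0;
    }

    public static void main(String[] args) {
        System.out.println(monthsBetween(LocalDate.of(2020, 3, 15), LocalDate.of(2020, 1, 15)));
        System.out.println(monthsBetween(LocalDate.of(1997, 2, 28), LocalDate.of(1996, 10, 30)));
    }
}
```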
- msDurationToString(long) - Static method in class org.apache.spark.util.Utils
-
Returns a human-readable string representing a duration such as "35ms".
- MsSqlServerDialect - Class in org.apache.spark.sql.jdbc
-
- MsSqlServerDialect() - Constructor for class org.apache.spark.sql.jdbc.MsSqlServerDialect
-
- mu() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- MulticlassClassificationEvaluator - Class in org.apache.spark.ml.evaluation
-
Experimental
Evaluator for multiclass classification, which expects two input columns: prediction and label.
- MulticlassClassificationEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- MulticlassClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- multiclassMetrics() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
- MulticlassMetrics - Class in org.apache.spark.mllib.evaluation
-
Evaluator for multiclass classification.
- MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
- MultilabelMetrics - Class in org.apache.spark.mllib.evaluation
-
Evaluator for multilabel classification.
- MultilabelMetrics(RDD<Tuple2<double[], double[]>>) - Constructor for class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
- multiLabelValidator(int) - Static method in class org.apache.spark.mllib.util.DataValidators
-
Function to check if labels used for k class multi-label classification are
in the range of {0, 1, ..., k - 1}.
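The range check described above can be sketched in plain Java as follows. This is an illustration under assumptions: labels are taken to be stored as doubles, and the check that each label is a whole number is included on the assumption that class labels must be integral. `LabelRangeCheckSketch` is a hypothetical name.

```java
// Sketch: verify that every label lies in {0, 1, ..., k - 1}.
final class LabelRangeCheckSketch {
    static boolean labelsInRange(double[] labels, int k) {
        for (double label : labels) {
            // NaN fails this comparison, so it is rejected as well.
            boolean isWholeNumber = label == Math.floor(label);
            if (!isWholeNumber || label < 0 || label > k - 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(labelsInRange(new double[] {0.0, 1.0, 2.0}, 3));
        System.out.println(labelsInRange(new double[] {0.0, 3.0}, 3));
    }
}
```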
- MultilayerPerceptronClassificationModel - Class in org.apache.spark.ml.classification
-
Classification model based on the Multilayer Perceptron.
- MultilayerPerceptronClassifier - Class in org.apache.spark.ml.classification
-
Classifier trainer based on the Multilayer Perceptron.
- MultilayerPerceptronClassifier(String) - Constructor for class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
- MultilayerPerceptronClassifier() - Constructor for class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
- MultilayerPerceptronParams - Interface in org.apache.spark.ml.classification
-
Params for Multilayer Perceptron.
- multiply(DenseMatrix) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Convenience method for Matrix-DenseMatrix multiplication.
- multiply(DenseVector) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Convenience method for Matrix-DenseVector multiplication.
- multiply(Vector) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Convenience method for Matrix-Vector multiplication.
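For readers unfamiliar with the operation, here is a minimal sketch of a dense matrix-vector product on raw arrays. It assumes column-major storage (the layout Spark's dense matrices use by default) and does not call any Spark or BLAS code; `MatVecSketch` is a hypothetical name.

```java
// Sketch: y = A * x for a dense matrix stored in column-major order.
final class MatVecSketch {
    static double[] multiply(double[] values, int numRows, int numCols, double[] x) {
        if (values.length != numRows * numCols || x.length != numCols) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] y = new double[numRows];
        for (int j = 0; j < numCols; j++) {
            for (int i = 0; i < numRows; i++) {
                // Element (i, j) lives at index j * numRows + i.
                y[i] += values[j * numRows + i] * x[j];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        // The matrix [[1, 2], [3, 4]] in column-major order.
        double[] a = {1.0, 3.0, 2.0, 4.0};
        double[] y = multiply(a, 2, 2, new double[] {1.0, 1.0});
        System.out.println(y[0] + ", " + y[1]);
    }
}
```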
- multiply(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- multiply(BlockMatrix, int) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Multiply this matrix by a local matrix on the right.
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Multiply this matrix by a local matrix on the right.
- multiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convenience method for Matrix-DenseMatrix multiplication.
- multiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convenience method for Matrix-DenseVector multiplication.
- multiply(Vector) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convenience method for Matrix-Vector multiplication.
- multiply(Object) - Method in class org.apache.spark.sql.Column
-
Multiplication of this expression and another expression.
- MultivariateGaussian - Class in org.apache.spark.ml.stat.distribution
-
This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.
- MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.ml.stat.distribution.MultivariateGaussian
-
- MultivariateGaussian - Class in org.apache.spark.mllib.stat.distribution
-
Developer API
This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.
- MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat
-
Developer API
MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean,
variance, minimum, maximum, counts, and nonzero counts for instances in sparse or dense vector
format in an online fashion.
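"Online" here means the statistics are updated one instance at a time without storing the data. The sketch below shows the idea for a single scalar column using Welford's update, which is one standard online technique; it is not claimed to be the summarizer's actual implementation, and `OnlineStatsSketch` is a hypothetical name.

```java
// Sketch: single-pass mean and sample variance (Welford's algorithm).
final class OnlineStatsSketch {
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    long count() { return count; }

    double mean() { return mean; }

    // Unbiased sample variance; defined as 0 for fewer than two samples.
    double variance() { return count > 1 ? m2 / (count - 1) : 0.0; }

    public static void main(String[] args) {
        OnlineStatsSketch stats = new OnlineStatsSketch();
        for (double v : new double[] {1.0, 2.0, 3.0, 4.0}) {
            stats.add(v);
        }
        System.out.println(stats.mean() + " " + stats.variance());
    }
}
```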
- MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat
-
Trait for multivariate statistical summary of a data matrix.
- MutableAggregationBuffer - Class in org.apache.spark.sql.expressions
-
A Row representing a mutable aggregation buffer.
- MutableAggregationBuffer() - Constructor for class org.apache.spark.sql.expressions.MutableAggregationBuffer
-
- MutablePair<T1,T2> - Class in org.apache.spark.util
-
Developer API
A tuple of 2 elements.
- MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
-
- MutablePair() - Constructor for class org.apache.spark.util.MutablePair
-
No-arg constructor for serialization
- MutableURLClassLoader - Class in org.apache.spark.util
-
URL class loader that exposes the `addURL` method in URLClassLoader.
- MutableURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.MutableURLClassLoader
-
- myName() - Method in class org.apache.spark.util.InnerClosureFinder
-
- MySQLDialect - Class in org.apache.spark.sql.jdbc
-
- MySQLDialect() - Constructor for class org.apache.spark.sql.jdbc.MySQLDialect
-
- p() - Method in class org.apache.spark.ml.feature.Normalizer
-
Normalization in L^p space.
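To make the role of `p` concrete, the following sketch scales a vector to unit p-norm. It is a plain-Java illustration, not the Normalizer's code; the handling of a zero vector (returned unchanged) and the class name `LpNormalizeSketch` are assumptions made for the example.

```java
// Sketch: divide a vector by its L^p norm, with p >= 1 (p may be infinity).
final class LpNormalizeSketch {
    static double[] normalize(double[] v, double p) {
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be >= 1");
        }
        double norm = 0.0;
        if (Double.isInfinite(p)) {
            for (double x : v) norm = Math.max(norm, Math.abs(x));
        } else {
            double sum = 0.0;
            for (double x : v) sum += Math.pow(Math.abs(x), p);
            norm = Math.pow(sum, 1.0 / p);
        }
        double[] out = v.clone();
        if (norm == 0.0) {
            return out; // assumption: leave an all-zero vector unchanged
        }
        for (int i = 0; i < out.length; i++) out[i] /= norm;
        return out;
    }

    public static void main(String[] args) {
        double[] unit = normalize(new double[] {3.0, 4.0}, 2.0);
        System.out.println(unit[0] + ", " + unit[1]);
    }
}
```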
- PagedTable<T> - Interface in org.apache.spark.ui
-
A paged table that will generate an HTML table for a specified page and also the page navigation.
- pageLink(int) - Method in interface org.apache.spark.ui.PagedTable
-
Return a link to jump to a page.
- pageNavigation(int, int, int) - Method in interface org.apache.spark.ui.PagedTable
-
Return a page navigation.
- pageNumberFormField() - Method in interface org.apache.spark.ui.PagedTable
-
- pageRank(double, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
- PageRank - Class in org.apache.spark.graphx.lib
-
PageRank algorithm implementation.
- PageRank() - Constructor for class org.apache.spark.graphx.lib.PageRank
-
- pageSizeFormField() - Method in interface org.apache.spark.ui.PagedTable
-
- PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream
-
Extra functions available on DStream of (key, value) pairs through an implicit conversion.
- PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
- PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more key-value pair records from each input record.
- PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function
-
A function that returns key-value pairs (Tuple2<K, V>), and can be used to
construct PairRDDs.
- PairRDDFunctions<K,V> - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
- PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
-
- PairwiseRRDD<T> - Class in org.apache.spark.api.r
-
Form an RDD[(Int, Array[Byte])] from key-value pairs returned from R.
- PairwiseRRDD(RDD<T>, int, byte[], String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.PairwiseRRDD
-
- parallelism() - Method in interface org.apache.spark.ml.param.shared.HasParallelism
-
The number of threads to use when running parallel algorithms.
- parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- Param<T> - Class in org.apache.spark.ml.param
-
Developer API
A param with self-contained documentation and optionally default value.
- Param(String, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
-
- Param(Identifiable, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
-
- Param(String, String, String) - Constructor for class org.apache.spark.ml.param.Param
-
- Param(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.Param
-
- param() - Method in class org.apache.spark.ml.param.ParamPair
-
- ParamGridBuilder - Class in org.apache.spark.ml.tuning
-
Builder for a param grid used in grid search-based model selection.
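A param grid for grid search is the Cartesian product of the candidate values for each parameter. The sketch below builds such a product with plain Java collections; it does not use the Spark builder, and the class name `ParamGridSketch` and the string parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: expand per-parameter candidate lists into every combination.
final class ParamGridSketch {
    static List<Map<String, Object>> build(Map<String, List<?>> candidates) {
        List<Map<String, Object>> grid = new ArrayList<>();
        grid.add(new LinkedHashMap<>()); // start from one empty combination
        for (Map.Entry<String, List<?>> entry : candidates.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> partial : grid) {
                for (Object value : entry.getValue()) {
                    Map<String, Object> extended = new LinkedHashMap<>(partial);
                    extended.put(entry.getKey(), value);
                    next.add(extended);
                }
            }
            grid = next;
        }
        return grid;
    }

    public static void main(String[] args) {
        Map<String, List<?>> candidates = new LinkedHashMap<>();
        candidates.put("regParam", Arrays.asList(0.1, 0.01));
        candidates.put("maxIter", Arrays.asList(10, 20, 30));
        for (Map<String, Object> combination : build(candidates)) {
            System.out.println(combination);
        }
    }
}
```

The grid size is the product of the list lengths, so it grows quickly as parameters are added.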
- ParamGridBuilder() - Constructor for class org.apache.spark.ml.tuning.ParamGridBuilder
-
- ParamMap - Class in org.apache.spark.ml.param
-
A param to value map.
- ParamMap() - Constructor for class org.apache.spark.ml.param.ParamMap
-
Creates an empty param map.
- paramMap() - Method in interface org.apache.spark.ml.param.Params
-
Internal param map for user-supplied values.
- ParamPair<T> - Class in org.apache.spark.ml.param
-
A param and its value.
- ParamPair(Param<T>, T) - Constructor for class org.apache.spark.ml.param.ParamPair
-
- Params - Interface in org.apache.spark.ml.param
-
Developer API
Trait for components that take parameters.
- params() - Method in interface org.apache.spark.ml.param.Params
-
Returns all params sorted by their names.
- ParamValidators - Class in org.apache.spark.ml.param
-
Developer API
Factory methods for common validation functions for Param.isValid.
- ParamValidators() - Constructor for class org.apache.spark.ml.param.ParamValidators
-
- parent() - Method in class org.apache.spark.ml.Model
-
The parent estimator that produced this model.
- parent() - Method in class org.apache.spark.ml.param.Param
-
- parent() - Method in interface org.apache.spark.scheduler.Schedulable
-
- ParentClassLoader - Class in org.apache.spark.util
-
A class loader which makes some protected methods in ClassLoader accessible.
- ParentClassLoader(ClassLoader) - Constructor for class org.apache.spark.util.ParentClassLoader
-
- parentIds() - Method in class org.apache.spark.scheduler.StageInfo
-
- parentIds() - Method in class org.apache.spark.storage.RDDInfo
-
- parentIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Get the parent index of the given node, or 0 if it is the root.
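A return value of 0 for the root is consistent with heap-style node numbering in which the root has index 1 and the children of node i are 2i and 2i + 1. The sketch below assumes that numbering; `TreeIndexSketch` and its child helpers are hypothetical names used only for illustration.

```java
// Sketch: heap-style binary tree indexing with the root at index 1.
final class TreeIndexSketch {
    // Parent of node i is floor(i / 2); the root (1) therefore maps to 0.
    static int parentIndex(int nodeIndex) {
        return nodeIndex >> 1;
    }

    static int leftChildIndex(int nodeIndex) {
        return nodeIndex << 1;
    }

    static int rightChildIndex(int nodeIndex) {
        return (nodeIndex << 1) + 1;
    }

    public static void main(String[] args) {
        for (int node = 1; node <= 7; node++) {
            System.out.println(node + " -> parent " + parentIndex(node));
        }
    }
}
```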
- parmap(Col, String, int, Function1<I, O>, CanBuildFrom<Col, Future<O>, Col>, CanBuildFrom<Col, O, Col>) - Static method in class org.apache.spark.util.ThreadUtils
-
Transforms input collection by applying the given function to each element in parallel fashion.
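The general pattern of a bounded parallel map can be sketched with the JDK's `ExecutorService`. This is an analogy in plain Java, not the Scala implementation behind the entry above; `ParMapSketch` is a hypothetical name, and wrapping failures in `RuntimeException` is a choice made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Sketch: apply a function to each element using at most maxThreads threads.
final class ParMapSketch {
    static <I, O> List<O> parmap(List<I> input, int maxThreads, Function<I, O> f) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Future<O>> futures = new ArrayList<>();
            for (I item : input) {
                Callable<O> task = () -> f.apply(item);
                futures.add(pool.submit(task));
            }
            // Collect in submission order so the output order matches the input.
            List<O> results = new ArrayList<>(input.size());
            for (Future<O> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parmap(Arrays.asList(1, 2, 3), 2, x -> x * 2));
    }
}
```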
- parquet(String...) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads a Parquet file, returning the result as a DataFrame.
- parquet(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads a Parquet file, returning the result as a DataFrame.
- parquet(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads a Parquet file, returning the result as a DataFrame.
- parquet(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame in Parquet format at the specified path.
- parquet(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Loads a Parquet file stream, returning the result as a DataFrame.
- parquetFile(String...) - Method in class org.apache.spark.sql.SQLContext
-
- parquetFile(Seq<String>) - Method in class org.apache.spark.sql.SQLContext
-
- parse(String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Parses a string produced by Vector.toString into a Vector.
- parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
-
Parses a string produced by LabeledPoint#toString into a LabeledPoint.
- parse(String) - Static method in class org.apache.spark.mllib.util.NumericParser
-
Parses a string into a Double, an Array[Double], or a Seq[Any].
- parseAll(Parsers.Parser<T>, Reader<Object>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- parseAll(Parsers.Parser<T>, Reader) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- parseAll(Parsers.Parser<T>, CharSequence) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- parseHostPort(String) - Static method in class org.apache.spark.util.Utils
-
- parseIgnoreCase(Class<E>, String) - Static method in class org.apache.spark.util.EnumUtil
-
- Parser(Function1<Reader<Object>, Parsers.ParseResult<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- parseStandaloneMasterUrls(String) - Static method in class org.apache.spark.util.Utils
-
Splits the comma-delimited string of master URLs into a list.
- PartialResult<R> - Class in org.apache.spark.partial
-
- PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
-
- Partition - Interface in org.apache.spark
-
An identifier for a partition in an RDD.
- partition() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- partition(String) - Method in class org.apache.spark.status.LiveRDD
-
- partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a copy of the RDD partitioned using the specified partitioner.
- partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.Graph
-
Repartitions the edges in the graph according to partitionStrategy.
- partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.Graph
-
Repartitions the edges in the graph according to partitionStrategy.
- partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a copy of the RDD partitioned using the specified partitioner.
- partitionBy(String...) - Method in class org.apache.spark.sql.DataFrameWriter
-
Partitions the output by the given columns on the file system.
- partitionBy(Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter
-
Partitions the output by the given columns on the file system.
- partitionBy(String, String...) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the partitioning defined.
- partitionBy(Column...) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the partitioning defined.
- partitionBy(String, Seq<String>) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the partitioning defined.
- partitionBy(Seq<Column>) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the partitioning defined.
- partitionBy(String, String...) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- partitionBy(Column...) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- partitionBy(String, Seq<String>) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- partitionBy(Seq<Column>) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- partitionBy(String...) - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Partitions the output by the given columns on the file system.
- partitionBy(Seq<String>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Partitions the output by the given columns on the file system.
- PartitionCoalescer - Interface in org.apache.spark.rdd
-
Developer API
A PartitionCoalescer defines how to coalesce the partitions of a given RDD.
- partitioner() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The partitioner of this RDD.
- partitioner() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
If partitionsRDD already has a partitioner, use it.
- partitioner() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- Partitioner - Class in org.apache.spark
-
An object that defines how the elements in a key-value pair RDD are partitioned by key.
- Partitioner() - Constructor for class org.apache.spark.Partitioner
-
- partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- partitioner() - Method in class org.apache.spark.rdd.RDD
-
Optionally overridden by subclasses to specify how they are partitioned.
- partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- partitioner() - Method in class org.apache.spark.ShuffleDependency
-
- partitioner(Partitioner) - Method in class org.apache.spark.streaming.StateSpec
-
Set the partitioner by which the state RDDs generated by mapWithState will be partitioned.
- PartitionGroup - Class in org.apache.spark.rdd
-
Developer API
A group of Partitions.
param: prefLoc preferred location for the partition group
- PartitionGroup(Option<String>) - Constructor for class org.apache.spark.rdd.PartitionGroup
-
- partitionId() - Method in class org.apache.spark.BarrierTaskContext
-
- partitionID() - Method in class org.apache.spark.TaskCommitDenied
-
- partitionId() - Method in class org.apache.spark.TaskContext
-
The ID of the RDD partition that is computed by this task.
- Partitioning - Interface in org.apache.spark.sql.sources.v2.reader.partitioning
-
- PartitionLocations(RDD<?>) - Constructor for class org.apache.spark.rdd.DefaultPartitionCoalescer.PartitionLocations
-
- PartitionOffset - Interface in org.apache.spark.sql.sources.v2.reader.streaming
-
Used for per-partition offsets in continuous processing.
- PartitionPruningRDD<T> - Class in org.apache.spark.rdd
-
Developer API
An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions.
- PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
-
- partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Set of partitions in this RDD.
- partitions() - Method in class org.apache.spark.rdd.PartitionGroup
-
- partitions() - Method in class org.apache.spark.rdd.RDD
-
Get the array of partitions of this RDD, taking into account whether the
RDD is checkpointed or not.
- partitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- partitionsRDD() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- partitionsRDD() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- PartitionStrategy - Interface in org.apache.spark.graphx
-
Represents the way edges are assigned to edge partitions based on their source and destination
vertex IDs.
- PartitionStrategy.CanonicalRandomVertexCut$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical
direction, resulting in a random vertex cut that colocates all edges between two vertices,
regardless of direction.
- PartitionStrategy.EdgePartition1D$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions using only the source vertex ID, colocating edges with the same
source.
- PartitionStrategy.EdgePartition2D$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix,
guaranteeing a 2 * sqrt(numParts) bound on vertex replication.
- PartitionStrategy.RandomVertexCut$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a
random vertex cut that colocates all same-direction edges between two vertices.
- partsWithLocs() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer.PartitionLocations
-
- partsWithoutLocs() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer.PartitionLocations
-
- path() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- path() - Method in class org.apache.spark.scheduler.SplitInfo
-
- PATH_KEY - Static variable in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
The option key for singular path.
- paths() - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
Returns all the paths specified by both the singular path option and the multiple
paths option.
- PATHS_KEY - Static variable in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
The option key for multiple paths.
- pattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
Regex pattern used to match delimiters if gaps is true or tokens if gaps is false.
- pc() - Method in class org.apache.spark.ml.feature.PCAModel
-
- pc() - Method in class org.apache.spark.mllib.feature.PCAModel
-
- PCA - Class in org.apache.spark.ml.feature
-
PCA trains a model to project vectors to a lower dimensional space of the top k principal components.
- PCA(String) - Constructor for class org.apache.spark.ml.feature.PCA
-
- PCA() - Constructor for class org.apache.spark.ml.feature.PCA
-
- PCA - Class in org.apache.spark.mllib.feature
-
A feature transformer that projects vectors to a low-dimensional space using PCA.
- PCA(int) - Constructor for class org.apache.spark.mllib.feature.PCA
-
- PCAModel - Class in org.apache.spark.ml.feature
-
- PCAModel - Class in org.apache.spark.mllib.feature
-
Model fitted by PCA that can project vectors to a low-dimensional space using PCA.
- PCAParams - Interface in org.apache.spark.ml.feature
-
- PCAUtil - Class in org.apache.spark.mllib.feature
-
- PCAUtil() - Constructor for class org.apache.spark.mllib.feature.PCAUtil
-
- pdf(Vector) - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian
-
Returns the density of this multivariate Gaussian at the given point, x.
- pdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
Returns the density of this multivariate Gaussian at the given point, x.
- PEAK_EXECUTION_MEMORY() - Static method in class org.apache.spark.InternalAccumulator
-
- PEAK_EXECUTION_MEMORY() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- PEAK_EXECUTION_MEMORY() - Static method in class org.apache.spark.ui.ToolTips
-
- PEAK_MEM() - Static method in class org.apache.spark.status.TaskIndexNames
-
- peakExecutionMemory() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- peakExecutionMemory() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- PEARSON() - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- PearsonCorrelation - Class in org.apache.spark.mllib.stat.correlation
-
Compute Pearson correlation for two RDDs of the type RDD[Double] or the correlation matrix
for an RDD of the type RDD[Vector].
- PearsonCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
- percent_rank() - Static method in class org.apache.spark.sql.functions
-
Window function: returns the relative rank (i.e.
- percentile() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
Percentile of features that selector will select, ordered by statistics value descending.
- percentile() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.Graph
-
Caches the vertices and edges associated with this graph at the specified storage level,
ignoring any target storage levels previously set.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
Persists the edge partitions at the specified storage level, ignoring any existing target
storage level.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
Persists the vertex partitions at the specified storage level, ignoring any existing target
storage level.
- persist(StorageLevel) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Persists the underlying RDD with the specified storage level.
- persist(StorageLevel) - Method in class org.apache.spark.rdd.HadoopRDD
-
- persist(StorageLevel) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (MEMORY_ONLY).
- persist() - Method in class org.apache.spark.sql.Dataset
-
Persist this Dataset with the default storage level (MEMORY_AND_DISK).
- persist(StorageLevel) - Method in class org.apache.spark.sql.Dataset
-
Persist this Dataset with the given storage level.
- persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist the RDDs of this DStream with the given storage level
- persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist the RDDs of this DStream with the given storage level
- persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist the RDDs of this DStream with the given storage level
- persist() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- personalizedPageRank(long, double, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run personalized PageRank for a given vertex, such that all random walks
are started relative to the source node.
- phrase(Parsers.Parser<T>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- pi() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-
- pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
-
- pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
-
- pickBin(Partition, RDD<?>, double, DefaultPartitionCoalescer.PartitionLocations) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
Takes a parent RDD partition and decides which of the partition groups to put it in.
Takes locality into account, but also uses power of 2 choices to load balance.
It strikes a balance between the two using the balanceSlack variable.
- pickRandomVertex() - Method in class org.apache.spark.graphx.GraphOps
-
Picks a random vertex from the graph and returns its ID.
- pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>, Map<String, String>, boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>, Map<String, String>, boolean, int, String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(String) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean, int, String) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- Pipeline - Class in org.apache.spark.ml
-
A simple pipeline, which acts as an estimator.
- Pipeline(String) - Constructor for class org.apache.spark.ml.Pipeline
-
- Pipeline() - Constructor for class org.apache.spark.ml.Pipeline
-
- Pipeline.SharedReadWrite$ - Class in org.apache.spark.ml
-
- PipelineModel - Class in org.apache.spark.ml
-
Represents a fitted pipeline.
- PipelineStage - Class in org.apache.spark.ml
-
- PipelineStage() - Constructor for class org.apache.spark.ml.PipelineStage
-
- pivot(String) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Pivots a column of the current DataFrame and performs the specified aggregation.
- pivot(String, Seq<Object>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Pivots a column of the current DataFrame and performs the specified aggregation.
- pivot(String, List<Object>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
(Java-specific) Pivots a column of the current DataFrame and performs the specified aggregation.
- pivot(Column) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Pivots a column of the current DataFrame and performs the specified aggregation.
- pivot(Column, Seq<Object>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Pivots a column of the current DataFrame and performs the specified aggregation.
- pivot(Column, List<Object>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
(Java-specific) Pivots a column of the current DataFrame and performs the specified aggregation.
- PivotType$() - Constructor for class org.apache.spark.sql.RelationalGroupedDataset.PivotType$
-
- plan() - Method in exception org.apache.spark.sql.AnalysisException
-
- planBatchInputPartitions() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsScanColumnarBatch
-
- planInputPartitions() - Method in interface org.apache.spark.sql.sources.v2.reader.DataSourceReader
-
- planInputPartitions() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsScanColumnarBatch
-
- plus(Object) - Method in class org.apache.spark.sql.Column
-
Sum of this expression and another expression.
- plus(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- plus(Duration) - Method in class org.apache.spark.streaming.Duration
-
- plus(Duration) - Method in class org.apache.spark.streaming.Time
-
- pmml() - Method in interface org.apache.spark.mllib.pmml.export.PMMLModelExport
-
Holder of the exported model in PMML format
- PMMLExportable - Interface in org.apache.spark.mllib.pmml
-
Developer API
Export model to the PMML format.
Predictive Model Markup Language (PMML) is an XML-based file format
developed by the Data Mining Group (www.dmg.org).
- PMMLKMeansModelWriter - Class in org.apache.spark.ml.clustering
-
A writer for KMeans that handles the "pmml" format
- PMMLKMeansModelWriter() - Constructor for class org.apache.spark.ml.clustering.PMMLKMeansModelWriter
-
- PMMLLinearRegressionModelWriter - Class in org.apache.spark.ml.regression
-
A writer for LinearRegression that handles the "pmml" format
- PMMLLinearRegressionModelWriter() - Constructor for class org.apache.spark.ml.regression.PMMLLinearRegressionModelWriter
-
- PMMLModelExport - Interface in org.apache.spark.mllib.pmml.export
-
- PMMLModelExportFactory - Class in org.apache.spark.mllib.pmml.export
-
- PMMLModelExportFactory() - Constructor for class org.apache.spark.mllib.pmml.export.PMMLModelExportFactory
-
- pmod(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the positive value of dividend mod divisor.
- point() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- POINTS() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- pointSilhouetteCoefficient(Set<Object>, double, long, Function1<Object, Object>) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette
-
- pointSilhouetteCoefficient(Set<Object>, double, long, Function1<Object, Object>) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
-
- POISON_PILL() - Static method in class org.apache.spark.scheduler.AsyncEventQueue
-
- Poisson$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
-
- PoissonBounds - Class in org.apache.spark.util.random
-
Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact
sample sizes with high confidence when sampling with replacement.
- PoissonBounds() - Constructor for class org.apache.spark.util.random.PoissonBounds
-
- PoissonGenerator - Class in org.apache.spark.mllib.random
-
Developer API
Generates i.i.d.
- PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
-
- poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Java-friendly version of RandomRDDs.poissonRDD.
- poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.poissonJavaRDD with the default seed.
- poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.poissonJavaRDD with the default number of partitions and the default seed.
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Java-friendly version of RandomRDDs.poissonVectorRDD.
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.poissonJavaVectorRDD with the default seed.
- poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
RandomRDDs.poissonJavaVectorRDD with the default number of partitions and the default seed.
- poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
- PoissonSampler<T> - Class in org.apache.spark.util.random
-
Developer API
A sampler for sampling with replacement, based on values drawn from Poisson distribution.
- PoissonSampler(double, boolean) - Constructor for class org.apache.spark.util.random.PoissonSampler
-
- PoissonSampler(double) - Constructor for class org.apache.spark.util.random.PoissonSampler
-
- poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.
- PolynomialExpansion - Class in org.apache.spark.ml.feature
-
Perform feature expansion in a polynomial space.
- PolynomialExpansion(String) - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
-
- PolynomialExpansion() - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
-
- popStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the population standard deviation of this RDD's elements.
- popStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the population standard deviation of this RDD's elements.
- popStdev() - Method in class org.apache.spark.util.StatCounter
-
Return the population standard deviation of the values.
- popVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the population variance of this RDD's elements.
- popVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the population variance of this RDD's elements.
- popVariance() - Method in class org.apache.spark.util.StatCounter
-
Return the population variance of the values.
- port() - Method in interface org.apache.spark.SparkExecutorInfo
-
- port() - Method in class org.apache.spark.SparkExecutorInfoImpl
-
- port() - Method in class org.apache.spark.storage.BlockManagerId
-
- PortableDataStream - Class in org.apache.spark.input
-
A class that allows DataStreams to be serialized and moved around by not creating them
until they need to be read
- PortableDataStream(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.PortableDataStream
-
- portMaxRetries(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Maximum number of retries when binding to a port before giving up.
- posexplode(Column) - Static method in class org.apache.spark.sql.functions
-
Creates a new row for each element with position in the given array or map column.
- posexplode_outer(Column) - Static method in class org.apache.spark.sql.functions
-
Creates a new row for each element with position in the given array or map column.
- position() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
-
- positioned(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- post(SparkListenerEvent) - Method in class org.apache.spark.scheduler.AsyncEventQueue
-
- Postfix$() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan.Postfix$
-
- PostgresDialect - Class in org.apache.spark.sql.jdbc
-
- PostgresDialect() - Constructor for class org.apache.spark.sql.jdbc.PostgresDialect
-
- postStartHook() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- postToAll(E) - Method in interface org.apache.spark.util.ListenerBus
-
Post the event to all registered listeners.
- pow(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(Column, String) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(String, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(String, String) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(Column, double) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(String, double) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(double, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(double, String) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- POW_10() - Static method in class org.apache.spark.sql.types.Decimal
-
- PowerIterationClustering - Class in org.apache.spark.ml.clustering
-
Experimental
Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by
Lin and Cohen.
- PowerIterationClustering() - Constructor for class org.apache.spark.ml.clustering.PowerIterationClustering
-
- PowerIterationClustering - Class in org.apache.spark.mllib.clustering
-
Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by
Lin and Cohen.
- PowerIterationClustering() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Constructs a PIC instance with default parameters: {k: 2, maxIterations: 100,
initMode: "random"}.
- PowerIterationClustering.Assignment - Class in org.apache.spark.mllib.clustering
-
Cluster assignment.
- PowerIterationClustering.Assignment$ - Class in org.apache.spark.mllib.clustering
-
- PowerIterationClusteringModel - Class in org.apache.spark.mllib.clustering
-
- PowerIterationClusteringModel(int, RDD<PowerIterationClustering.Assignment>) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- PowerIterationClusteringModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.clustering
-
- PowerIterationClusteringParams - Interface in org.apache.spark.ml.clustering
-
Common params for PowerIterationClustering
- pr() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-
Returns the precision-recall curve, which is a DataFrame containing
two fields, recall and precision, with (0.0, 1.0) prepended to it.
- pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the precision-recall curve, which is an RDD of (recall, precision),
NOT (precision, recall), with (0.0, p) prepended to it, where p is the precision
associated with the lowest recall on the curve.
- preciseSize() - Method in interface org.apache.spark.storage.memory.MemoryEntryBuilder
-
- Precision - Class in org.apache.spark.mllib.evaluation.binary
-
Precision.
- Precision() - Constructor for class org.apache.spark.mllib.evaluation.binary.Precision
-
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns precision for a given label (category)
- precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
- precision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based precision averaged by the number of documents
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns precision for a given label (category)
- precision() - Method in class org.apache.spark.sql.types.Decimal
-
- precision() - Method in class org.apache.spark.sql.types.DecimalType
-
- precisionAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
-
Compute the average precision of all the queries, truncated at ranking position k.
- precisionByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Returns precision for each label (category).
- precisionByThreshold() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-
Returns the (threshold, precision) curve as a DataFrame with two fields.
- precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, precision) curve.
- predict(Vector) - Method in interface org.apache.spark.ml.ann.TopologyModel
-
Prediction of the model.
- predict(FeaturesType) - Method in class org.apache.spark.ml.classification.ClassificationModel
-
Predict label for the given features.
- predict(Vector) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- predict(Vector) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- predict(Vector) - Method in class org.apache.spark.ml.classification.LinearSVCModel
-
- predict(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
Predict label for the given feature vector.
- predict(Vector) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
-
Predict label for the given features.
- predict(FeaturesType) - Method in class org.apache.spark.ml.PredictionModel
-
Predict label for the given features.
- predict(Vector) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- predict(Vector) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- predict(Vector) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- predict(Vector) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
- predict(Vector) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- predict(Vector) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for a single data point using the model trained.
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for examples stored in a JavaRDD.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
Predicts the index of the cluster that the input point belongs to.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
Predicts the indices of the clusters that the input points belong to.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
Java-friendly version of predict().
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Maps given points to their cluster indices.
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Maps given point to its cluster index.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Java-friendly version of predict()
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Returns the cluster index that a given point belongs to.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Maps given points to their cluster indices.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Maps given points to their cluster indices.
- predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Predict the rating of one user for one product.
- predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Predict the rating of many users for many products.
- predict(JavaPairRDD<Integer, Integer>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Java-friendly version of MatrixFactorizationModel.predict.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Object>) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict labels for provided features.
- predict(JavaDoubleRDD) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict labels for provided features.
- predict(double) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict a single label.
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for a single data point using the model trained.
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for examples stored in a JavaRDD.
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for the given data set using the model trained.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for the given data set using the model trained.
- predict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- predict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- predict() - Method in class org.apache.spark.mllib.tree.model.Node
-
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node
-
Predict value if node is not leaf.
- Predict - Class in org.apache.spark.mllib.tree.model
-
Developer API
Predicted value for a node
param: predict predicted value
param: prob probability of the label (classification only)
- Predict(double, double) - Constructor for class org.apache.spark.mllib.tree.model.Predict
-
- predict() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- PredictData(double, double) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- PredictData$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData$
-
- prediction() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
-
- prediction() - Method in class org.apache.spark.ml.tree.InternalNode
-
- prediction() - Method in class org.apache.spark.ml.tree.LeafNode
-
- prediction() - Method in class org.apache.spark.ml.tree.Node
-
Prediction a leaf node makes, or which an internal node would make if it were a leaf node
- predictionCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Field in "predictions" which gives the prediction of each class.
- predictionCol() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
-
- predictionCol() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
-
- predictionCol() - Method in interface org.apache.spark.ml.param.shared.HasPredictionCol
-
Param for prediction column name.
- predictionCol() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
Field in "predictions" which gives the predicted value of each instance.
- predictionCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
- PredictionModel<FeaturesType,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml
-
Developer API
Abstraction for a model for prediction tasks (regression and classification).
- PredictionModel() - Constructor for class org.apache.spark.ml.PredictionModel
-
- predictions() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Dataframe output by the model's transform method.
- predictions() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
-
- predictions() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
-
- predictions() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
Predictions output by the model's transform method.
- predictions() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
Predictions associated with the boundaries at the same index, monotone because of isotonic regression.
- predictions() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
- predictions() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Use the clustering model to make predictions on batches of data from a DStream.
- predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Java-friendly version of predictOn.
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Use the model to make predictions on batches of data from a DStream.
- predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of predictOn.
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Use the model to make predictions on the values of a DStream and carry over its keys.
- predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Java-friendly version of predictOnValues.
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Use the model to make predictions on the values of a DStream and carry over its keys.
- predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of predictOnValues.
- Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml
-
Developer API
Abstraction for prediction problems (regression and classification).
- Predictor() - Constructor for class org.apache.spark.ml.Predictor
-
- PredictorParams - Interface in org.apache.spark.ml
-
(private[ml]) Trait for parameters for prediction (regression and classification).
- predictProbabilities(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
Predict values for the given data set using the model trained.
- predictProbabilities(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
Predict posterior class probabilities for a single data point using the model trained.
- predictQuantiles(Vector) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- predictRaw(Vector) - Method in interface org.apache.spark.ml.ann.TopologyModel
-
Raw prediction of the model.
- predictSoft(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Given the input vectors, return the membership value of each vector to all mixture components.
- predictSoft(Vector) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Given the input vector, return the membership values to all mixture components.
- preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Override this to specify a preferred location (hostname).
- preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
-
Get the preferred locations of a partition, taking into account whether the RDD is checkpointed.
- preferredLocations() - Method in interface org.apache.spark.sql.sources.v2.reader.InputPartition
-
The preferred locations where the input partition reader returned by this partition can run faster, but Spark does not guarantee to run the input partition reader on these locations.
- Prefix$() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan.Prefix$
-
- prefixesToRewrite() - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-
- PrefixSpan - Class in org.apache.spark.ml.fpm
-
Experimental
A parallel PrefixSpan algorithm to mine frequent sequential patterns.
- PrefixSpan(String) - Constructor for class org.apache.spark.ml.fpm.PrefixSpan
-
- PrefixSpan() - Constructor for class org.apache.spark.ml.fpm.PrefixSpan
-
- PrefixSpan - Class in org.apache.spark.mllib.fpm
-
A parallel PrefixSpan algorithm to mine frequent sequential patterns.
- PrefixSpan() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan
-
Constructs a default instance with default parameters {minSupport: 0.1, maxPatternLength: 10, maxLocalProjDBSize: 32000000L}.
- PrefixSpan.FreqSequence<Item> - Class in org.apache.spark.mllib.fpm
-
Represents a frequent sequence.
- PrefixSpan.Postfix$ - Class in org.apache.spark.mllib.fpm
-
- PrefixSpan.Prefix$ - Class in org.apache.spark.mllib.fpm
-
- PrefixSpanModel<Item> - Class in org.apache.spark.mllib.fpm
-
Model fitted by PrefixSpan
param: freqSequences frequent sequences
- PrefixSpanModel(RDD<PrefixSpan.FreqSequence<Item>>) - Constructor for class org.apache.spark.mllib.fpm.PrefixSpanModel
-
- PrefixSpanModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.fpm
-
- prefLoc() - Method in class org.apache.spark.rdd.PartitionGroup
-
- pregel(A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<A>) - Method in class org.apache.spark.graphx.GraphOps
-
Execute a Pregel-like iterative vertex-parallel abstraction.
- Pregel - Class in org.apache.spark.graphx
-
Implements a Pregel-like bulk-synchronous message-passing API.
- Pregel() - Constructor for class org.apache.spark.graphx.Pregel
-
- prepareWritable(Writable, Seq<Tuple2<String, String>>) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- prepareWrite(SparkSession, Job, Map<String, String>, StructType) - Method in class org.apache.spark.sql.hive.execution.HiveFileFormat
-
- prepareWrite(SparkSession, Job, Map<String, String>, StructType) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- prependBaseUri(HttpServletRequest, String, String) - Static method in class org.apache.spark.ui.UIUtils
-
- prettyJson() - Method in class org.apache.spark.sql.streaming.SinkProgress
-
The pretty (i.e. indented) JSON representation of this progress.
- prettyJson() - Method in class org.apache.spark.sql.streaming.SourceProgress
-
The pretty (i.e. indented) JSON representation of this progress.
- prettyJson() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
-
The pretty (i.e. indented) JSON representation of this progress.
- prettyJson() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
The pretty (i.e. indented) JSON representation of this progress.
- prettyJson() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
-
The pretty (i.e. indented) JSON representation of this status.
- prettyJson() - Static method in class org.apache.spark.sql.types.BinaryType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.BooleanType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.ByteType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- prettyJson() - Method in class org.apache.spark.sql.types.DataType
-
The pretty (i.e. indented) JSON representation of this data type.
- prettyJson() - Static method in class org.apache.spark.sql.types.DateType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.DoubleType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.FloatType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.IntegerType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.LongType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.NullType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.ShortType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.StringType
-
- prettyJson() - Static method in class org.apache.spark.sql.types.TimestampType
-
- prettyPrint() - Method in class org.apache.spark.streaming.Duration
-
- prev() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- prev() - Method in class org.apache.spark.status.LiveRDDPartition
-
- prevPageSizeFormField() - Method in interface org.apache.spark.ui.PagedTable
-
- print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Print the first ten elements of each RDD generated in this DStream.
- print(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Print the first num elements of each RDD generated in this DStream.
- print() - Method in class org.apache.spark.streaming.dstream.DStream
-
Print the first ten elements of each RDD generated in this DStream.
- print(int) - Method in class org.apache.spark.streaming.dstream.DStream
-
Print the first num elements of each RDD generated in this DStream.
- printErrorAndExit(String) - Method in interface org.apache.spark.util.CommandLineUtils
-
- printMessage(String) - Method in interface org.apache.spark.util.CommandLineUtils
-
- printSchema() - Method in class org.apache.spark.sql.Dataset
-
Prints the schema to the console in a nice tree format.
- printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-
- printStream() - Method in interface org.apache.spark.util.CommandLineUtils
-
- printTreeString() - Method in class org.apache.spark.sql.types.StructType
-
- prioritize(BlockManagerId, Seq<BlockManagerId>, HashSet<BlockManagerId>, BlockId, int) - Method in class org.apache.spark.storage.BasicBlockReplicationPolicy
-
Method to prioritize a bunch of candidate peers of a block manager.
- prioritize(BlockManagerId, Seq<BlockManagerId>, HashSet<BlockManagerId>, BlockId, int) - Method in interface org.apache.spark.storage.BlockReplicationPolicy
-
Method to prioritize a bunch of candidate peers of a block.
- prioritize(BlockManagerId, Seq<BlockManagerId>, HashSet<BlockManagerId>, BlockId, int) - Method in class org.apache.spark.storage.RandomBlockReplicationPolicy
-
Method to prioritize a bunch of candidate peers of a block.
- priority() - Method in interface org.apache.spark.scheduler.Schedulable
-
- prob() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- prob() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- ProbabilisticClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
Developer API
- ProbabilisticClassificationModel() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-
- ProbabilisticClassifier<FeaturesType,E extends ProbabilisticClassifier<FeaturesType,E,M>,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
Developer API
- ProbabilisticClassifier() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassifier
-
- ProbabilisticClassifierParams - Interface in org.apache.spark.ml.classification
-
(private[classification]) Params for probabilistic classification.
- probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- probability() - Method in class org.apache.spark.ml.clustering.GaussianMixtureSummary
-
Probability of each cluster.
- probabilityCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Field in "predictions" which gives the probability of each class as a vector.
- probabilityCol() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
-
- probabilityCol() - Method in class org.apache.spark.ml.clustering.GaussianMixtureSummary
-
- probabilityCol() - Method in interface org.apache.spark.ml.param.shared.HasProbabilityCol
-
Param for column name for predicted class conditional probabilities.
- Probit$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$
-
- process(T) - Method in class org.apache.spark.sql.ForeachWriter
-
Called to process the data in the executor side.
- PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- processAllAvailable() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Blocks until all available data in the source has been processed and committed to the sink.
- processedRowsPerSecond() - Method in class org.apache.spark.sql.streaming.SourceProgress
-
- processedRowsPerSecond() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
The aggregate (across all sources) rate at which Spark is processing data.
- processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for all the jobs of this batch to finish processing, measured from the time they started processing.
- processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- ProcessingTime - Class in org.apache.spark.sql.streaming
-
- ProcessingTime(long) - Constructor for class org.apache.spark.sql.streaming.ProcessingTime
-
Deprecated.
- ProcessingTime(long) - Static method in class org.apache.spark.sql.streaming.Trigger
-
A trigger policy that runs a query periodically based on an interval in processing time.
- ProcessingTime(long, TimeUnit) - Static method in class org.apache.spark.sql.streaming.Trigger
-
(Java-friendly) A trigger policy that runs a query periodically based on an interval in processing time.
- ProcessingTime(Duration) - Static method in class org.apache.spark.sql.streaming.Trigger
-
(Scala-friendly) A trigger policy that runs a query periodically based on an interval in processing time.
- ProcessingTime(String) - Static method in class org.apache.spark.sql.streaming.Trigger
-
A trigger policy that runs a query periodically based on an interval in processing time.
- processingTime() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
-
- ProcessingTimeTimeout() - Static method in class org.apache.spark.sql.streaming.GroupStateTimeout
-
Timeout based on processing time.
- processStreamByLine(String, InputStream, Function1<String, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Return and start a daemon thread that processes the content of the input stream line by line.
- producedAttributes() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
-
- product() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- product(TypeTags.TypeTag<T>) - Static method in class org.apache.spark.sql.Encoders
-
An encoder for Scala's product type (tuples, case classes, etc).
- productArity() - Static method in class org.apache.spark.ExpireDeadHosts
-
- productArity() - Static method in class org.apache.spark.ml.feature.Dot
-
- productArity() - Static method in class org.apache.spark.Resubmitted
-
- productArity() - Static method in class org.apache.spark.rpc.netty.OnStart
-
- productArity() - Static method in class org.apache.spark.rpc.netty.OnStop
-
- productArity() - Static method in class org.apache.spark.scheduler.AllJobsCancelled
-
- productArity() - Static method in class org.apache.spark.scheduler.JobSucceeded
-
- productArity() - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
-
- productArity() - Static method in class org.apache.spark.scheduler.StopCoordinator
-
- productArity() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- productArity() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- productArity() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- productArity() - Static method in class org.apache.spark.sql.types.BinaryType
-
- productArity() - Static method in class org.apache.spark.sql.types.BooleanType
-
- productArity() - Static method in class org.apache.spark.sql.types.ByteType
-
- productArity() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- productArity() - Static method in class org.apache.spark.sql.types.DateType
-
- productArity() - Static method in class org.apache.spark.sql.types.DoubleType
-
- productArity() - Static method in class org.apache.spark.sql.types.FloatType
-
- productArity() - Static method in class org.apache.spark.sql.types.IntegerType
-
- productArity() - Static method in class org.apache.spark.sql.types.LongType
-
- productArity() - Static method in class org.apache.spark.sql.types.NullType
-
- productArity() - Static method in class org.apache.spark.sql.types.ShortType
-
- productArity() - Static method in class org.apache.spark.sql.types.StringType
-
- productArity() - Static method in class org.apache.spark.sql.types.TimestampType
-
- productArity() - Static method in class org.apache.spark.StopMapOutputTracker
-
- productArity() - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
-
- productArity() - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
-
- productArity() - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
-
- productArity() - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
-
- productArity() - Static method in class org.apache.spark.Success
-
- productArity() - Static method in class org.apache.spark.TaskResultLost
-
- productArity() - Static method in class org.apache.spark.TaskSchedulerIsSet
-
- productArity() - Static method in class org.apache.spark.UnknownReason
-
- productElement(int) - Static method in class org.apache.spark.ExpireDeadHosts
-
- productElement(int) - Static method in class org.apache.spark.ml.feature.Dot
-
- productElement(int) - Static method in class org.apache.spark.Resubmitted
-
- productElement(int) - Static method in class org.apache.spark.rpc.netty.OnStart
-
- productElement(int) - Static method in class org.apache.spark.rpc.netty.OnStop
-
- productElement(int) - Static method in class org.apache.spark.scheduler.AllJobsCancelled
-
- productElement(int) - Static method in class org.apache.spark.scheduler.JobSucceeded
-
- productElement(int) - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
-
- productElement(int) - Static method in class org.apache.spark.scheduler.StopCoordinator
-
- productElement(int) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- productElement(int) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- productElement(int) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- productElement(int) - Static method in class org.apache.spark.sql.types.BinaryType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.BooleanType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.ByteType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.DateType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.DoubleType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.FloatType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.IntegerType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.LongType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.NullType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.ShortType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.StringType
-
- productElement(int) - Static method in class org.apache.spark.sql.types.TimestampType
-
- productElement(int) - Static method in class org.apache.spark.StopMapOutputTracker
-
- productElement(int) - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
-
- productElement(int) - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
-
- productElement(int) - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
-
- productElement(int) - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
-
- productElement(int) - Static method in class org.apache.spark.Success
-
- productElement(int) - Static method in class org.apache.spark.TaskResultLost
-
- productElement(int) - Static method in class org.apache.spark.TaskSchedulerIsSet
-
- productElement(int) - Static method in class org.apache.spark.UnknownReason
-
- productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- productIterator() - Static method in class org.apache.spark.ExpireDeadHosts
-
- productIterator() - Static method in class org.apache.spark.ml.feature.Dot
-
- productIterator() - Static method in class org.apache.spark.Resubmitted
-
- productIterator() - Static method in class org.apache.spark.rpc.netty.OnStart
-
- productIterator() - Static method in class org.apache.spark.rpc.netty.OnStop
-
- productIterator() - Static method in class org.apache.spark.scheduler.AllJobsCancelled
-
- productIterator() - Static method in class org.apache.spark.scheduler.JobSucceeded
-
- productIterator() - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
-
- productIterator() - Static method in class org.apache.spark.scheduler.StopCoordinator
-
- productIterator() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- productIterator() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- productIterator() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- productIterator() - Static method in class org.apache.spark.sql.types.BinaryType
-
- productIterator() - Static method in class org.apache.spark.sql.types.BooleanType
-
- productIterator() - Static method in class org.apache.spark.sql.types.ByteType
-
- productIterator() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- productIterator() - Static method in class org.apache.spark.sql.types.DateType
-
- productIterator() - Static method in class org.apache.spark.sql.types.DoubleType
-
- productIterator() - Static method in class org.apache.spark.sql.types.FloatType
-
- productIterator() - Static method in class org.apache.spark.sql.types.IntegerType
-
- productIterator() - Static method in class org.apache.spark.sql.types.LongType
-
- productIterator() - Static method in class org.apache.spark.sql.types.NullType
-
- productIterator() - Static method in class org.apache.spark.sql.types.ShortType
-
- productIterator() - Static method in class org.apache.spark.sql.types.StringType
-
- productIterator() - Static method in class org.apache.spark.sql.types.TimestampType
-
- productIterator() - Static method in class org.apache.spark.StopMapOutputTracker
-
- productIterator() - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
-
- productIterator() - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
-
- productIterator() - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
-
- productIterator() - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
-
- productIterator() - Static method in class org.apache.spark.Success
-
- productIterator() - Static method in class org.apache.spark.TaskResultLost
-
- productIterator() - Static method in class org.apache.spark.TaskSchedulerIsSet
-
- productIterator() - Static method in class org.apache.spark.UnknownReason
-
- productPrefix() - Static method in class org.apache.spark.ExpireDeadHosts
-
- productPrefix() - Static method in class org.apache.spark.ml.feature.Dot
-
- productPrefix() - Static method in class org.apache.spark.Resubmitted
-
- productPrefix() - Static method in class org.apache.spark.rpc.netty.OnStart
-
- productPrefix() - Static method in class org.apache.spark.rpc.netty.OnStop
-
- productPrefix() - Static method in class org.apache.spark.scheduler.AllJobsCancelled
-
- productPrefix() - Static method in class org.apache.spark.scheduler.JobSucceeded
-
- productPrefix() - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
-
- productPrefix() - Static method in class org.apache.spark.scheduler.StopCoordinator
-
- productPrefix() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- productPrefix() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
-
- productPrefix() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
-
- productPrefix() - Static method in class org.apache.spark.sql.types.BinaryType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.BooleanType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.ByteType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.DateType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.DoubleType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.FloatType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.IntegerType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.LongType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.NullType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.ShortType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.StringType
-
- productPrefix() - Static method in class org.apache.spark.sql.types.TimestampType
-
- productPrefix() - Static method in class org.apache.spark.StopMapOutputTracker
-
- productPrefix() - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
-
- productPrefix() - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
-
- productPrefix() - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
-
- productPrefix() - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
-
- productPrefix() - Static method in class org.apache.spark.Success
-
- productPrefix() - Static method in class org.apache.spark.TaskResultLost
-
- productPrefix() - Static method in class org.apache.spark.TaskSchedulerIsSet
-
- productPrefix() - Static method in class org.apache.spark.UnknownReason
-
- progress() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryProgressEvent
-
- project(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
-
- project(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
-
- properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- propertiesFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- propertiesToJson(Properties) - Static method in class org.apache.spark.util.JsonProtocol
-
- provider() - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
-
- provider() - Method in interface org.apache.spark.streaming.kinesis.SparkAWSCredentials
-
Return an AWSCredentialProvider instance that can be used by the Kinesis Client
Library to authenticate to AWS services (Kinesis, CloudWatch and DynamoDB).
- proxyBase() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- pruneColumns(StructType) - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsPushDownRequiredColumns
-
Applies column pruning w.r.t.
- PrunedFilteredScan - Interface in org.apache.spark.sql.sources
-
A BaseRelation that can eliminate unneeded columns and filter using selected
predicates before producing an RDD containing all matching tuples as Row objects.
- PrunedScan - Interface in org.apache.spark.sql.sources
-
A BaseRelation that can eliminate unneeded columns before producing an RDD
containing all of its tuples as Row objects.
- Pseudorandom - Interface in org.apache.spark.util.random
-
Developer API
A class with pseudorandom behavior.
- pushedFilters() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsPushDownFilters
-
- pushFilters(Filter[]) - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsPushDownFilters
-
Pushes down filters, and returns filters that need to be evaluated after scanning.
- put(ParamPair<?>...) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a list of param pairs (overwrites if the input params exist).
- put(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a (param, value) pair (overwrites if the input param exists).
- put(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a list of param pairs (overwrites if the input params exist).
- put(Object) - Method in class org.apache.spark.util.sketch.BloomFilter
-
Puts an item into this BloomFilter.
- putBinary(byte[]) - Method in class org.apache.spark.util.sketch.BloomFilter
-
- putBoolean(String, boolean) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Boolean.
- putBooleanArray(String, boolean[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Boolean array.
- putDouble(String, double) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Double.
- putDoubleArray(String, double[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Double array.
- putLong(String, long) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Long.
- putLong(long) - Method in class org.apache.spark.util.sketch.BloomFilter
-
- putLongArray(String, long[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Long array.
- putMetadata(String, Metadata) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
- putMetadataArray(String, Metadata[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
- putNull(String) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a null.
- putString(String, String) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a String.
- putString(String) - Method in class org.apache.spark.util.sketch.BloomFilter
-
- putStringArray(String, String[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a String array.
- pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- pValue() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
-
- pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
The probability of obtaining a test statistic result at least as extreme as the one that was
actually observed, assuming that the null hypothesis is true.
- pValues() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
-
Two-sided p-value of estimated coefficients and intercept.
- pValues() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Two-sided p-value of estimated coefficients and intercept.
- PythonStreamingListener - Interface in org.apache.spark.streaming.api.java
-
- pyUDT() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- R() - Method in class org.apache.spark.mllib.linalg.QRDecomposition
-
- r2() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Returns R², the coefficient of determination.
- r2() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns R², the unadjusted coefficient of determination.
- r2adj() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Returns adjusted R², the adjusted coefficient of determination.
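As a reminder of what these summary statistics mean, the following is a minimal plain-Java sketch of the textbook definitions: R² = 1 - SS_res / SS_tot, and the adjusted form for a model with an intercept. The class `R2Sketch` and its method names are hypothetical and exist only for illustration; this is not Spark's implementation, which may treat edge cases (such as regression through the origin) differently.

```java
// Textbook R^2 and adjusted R^2. A sketch under stated assumptions, not Spark code.
final class R2Sketch {

    /** R^2 = 1 - SS_res / SS_tot, with SS_tot measured around the mean of the labels. */
    static double r2(double[] labels, double[] predictions) {
        if (labels.length == 0 || labels.length != predictions.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double mean = 0.0;
        for (double y : labels) {
            mean += y;
        }
        mean /= labels.length;

        double ssRes = 0.0; // residual sum of squares
        double ssTot = 0.0; // total sum of squares around the mean
        for (int i = 0; i < labels.length; i++) {
            double residual = labels[i] - predictions[i];
            double deviation = labels[i] - mean;
            ssRes += residual * residual;
            ssTot += deviation * deviation;
        }
        return 1.0 - ssRes / ssTot;
    }

    /** Adjusted R^2 for n observations and numFeatures predictors, assuming an intercept. */
    static double r2adj(double r2, long n, int numFeatures) {
        return 1.0 - (1.0 - r2) * (n - 1) / (double) (n - numFeatures - 1);
    }

    public static void main(String[] args) {
        double[] labels = {1.0, 2.0, 3.0, 4.0};
        double[] predictions = {1.1, 1.9, 3.2, 3.8};
        double r2 = r2(labels, predictions);
        System.out.println("R^2 = " + r2 + ", adjusted = " + r2adj(r2, labels.length, 1));
    }
}
```

The adjusted value penalizes additional predictors, so it is never larger than the unadjusted R² for the same fit.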
- RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- radians(Column) - Static method in class org.apache.spark.sql.functions
-
Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
- radians(String) - Static method in class org.apache.spark.sql.functions
-
Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
- rand(int, int, Random) - Static method in class org.apache.spark.ml.linalg.DenseMatrix
-
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
- rand(int, int, Random) - Static method in class org.apache.spark.ml.linalg.Matrices
-
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
- rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
- rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
- rand(long) - Static method in class org.apache.spark.sql.functions
-
Generate a random column with independent and identically distributed (i.i.d.) samples
uniformly distributed in [0.0, 1.0).
- rand() - Static method in class org.apache.spark.sql.functions
-
Generate a random column with independent and identically distributed (i.i.d.) samples
uniformly distributed in [0.0, 1.0).
- randn(int, int, Random) - Static method in class org.apache.spark.ml.linalg.DenseMatrix
-
Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
- randn(int, int, Random) - Static method in class org.apache.spark.ml.linalg.Matrices
-
Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
- randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
- randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
- randn(long) - Static method in class org.apache.spark.sql.functions
-
Generate a column with independent and identically distributed (i.i.d.) samples from
the standard normal distribution.
- randn() - Static method in class org.apache.spark.sql.functions
-
Generate a column with independent and identically distributed (i.i.d.) samples from
the standard normal distribution.
- random() - Method in class org.apache.spark.ml.image.SamplePathFilter
-
- RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
-
- random() - Static method in class org.apache.spark.util.Utils
-
- RandomBlockReplicationPolicy - Class in org.apache.spark.storage
-
- RandomBlockReplicationPolicy() - Constructor for class org.apache.spark.storage.RandomBlockReplicationPolicy
-
- RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random
-
Developer API
Trait for random data generators that generate i.i.d.
- RandomForest - Class in org.apache.spark.ml.tree.impl
-
ALGORITHM
- RandomForest() - Constructor for class org.apache.spark.ml.tree.impl.RandomForest
-
- RandomForest - Class in org.apache.spark.mllib.tree
-
A class that implements a Random Forest learning algorithm for classification and regression.
- RandomForest(Strategy, int, String, int) - Constructor for class org.apache.spark.mllib.tree.RandomForest
-
- RandomForestClassificationModel - Class in org.apache.spark.ml.classification
-
- RandomForestClassifier - Class in org.apache.spark.ml.classification
-
- RandomForestClassifier(String) - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
-
- RandomForestClassifier() - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
-
- RandomForestClassifierParams - Interface in org.apache.spark.ml.tree
-
- RandomForestModel - Class in org.apache.spark.mllib.tree.model
-
Represents a random forest model.
- RandomForestModel(Enumeration.Value, DecisionTreeModel[]) - Constructor for class org.apache.spark.mllib.tree.model.RandomForestModel
-
- RandomForestParams - Interface in org.apache.spark.ml.tree
-
Parameters for Random Forest algorithms.
- RandomForestRegressionModel - Class in org.apache.spark.ml.regression
-
- RandomForestRegressor - Class in org.apache.spark.ml.regression
-
- RandomForestRegressor(String) - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
-
- RandomForestRegressor() - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
-
- RandomForestRegressorParams - Interface in org.apache.spark.ml.tree
-
- randomize(TraversableOnce<T>, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
-
Shuffle the elements of a collection into a random order, returning the
result in a new collection.
- randomizeInPlace(Object, Random) - Static method in class org.apache.spark.util.Utils
-
Shuffle the elements of an array into a random order, modifying the
original array.
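The difference between the two entries above (a shuffled copy versus an in-place shuffle) can be illustrated with a Fisher-Yates shuffle. This is a minimal plain-Java sketch; the class `ShuffleSketch` and its methods are hypothetical names chosen for illustration, and the choice of Fisher-Yates is an assumption here rather than a statement about how Spark's `Utils` is implemented.

```java
import java.util.Random;

// Fisher-Yates shuffle: an illustrative sketch, not Spark's Utils code.
final class ShuffleSketch {

    /** Shuffles arr in place and returns the same array instance. */
    static int[] shuffleInPlace(int[] arr, Random rand) {
        for (int i = arr.length - 1; i > 0; i--) {
            int j = rand.nextInt(i + 1); // uniform index in [0, i]
            int tmp = arr[j];
            arr[j] = arr[i];
            arr[i] = tmp;
        }
        return arr;
    }

    /** Returns a shuffled copy, leaving the input array untouched. */
    static int[] shuffled(int[] arr, Random rand) {
        return shuffleInPlace(arr.clone(), rand);
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        int[] copy = shuffled(data, new Random(42L));
        System.out.println(java.util.Arrays.toString(data) + " -> " + java.util.Arrays.toString(copy));
    }
}
```

Passing an explicitly seeded `Random` makes the permutation reproducible, which is useful in tests.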
- randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Developer API
Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
- randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Developer API
RandomRDDs.randomJavaRDD with the default seed.
- randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Developer API
RandomRDDs.randomJavaRDD with the default seed and numPartitions.
- randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Developer API
Java-friendly version of RandomRDDs.randomVectorRDD.
- randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Developer API
RandomRDDs.randomJavaVectorRDD with the default seed.
- randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Developer API
RandomRDDs.randomJavaVectorRDD with the default number of partitions and the default seed.
- randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Developer API
Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
- RandomRDDs - Class in org.apache.spark.mllib.random
-
Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
- RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
-
- RandomSampler<T,U> - Interface in org.apache.spark.util.random
-
Developer API
A pseudorandom sampler.
- randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.sql.Dataset
-
Randomly splits this Dataset with the provided weights.
- randomSplit(double[]) - Method in class org.apache.spark.sql.Dataset
-
Randomly splits this Dataset with the provided weights.
- randomSplitAsList(double[], long) - Method in class org.apache.spark.sql.Dataset
-
Returns a Java list containing the Datasets produced by randomly splitting this Dataset with the provided weights.
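To make "splits with the provided weights" concrete, here is a minimal plain-Java sketch of weighted random splitting over an in-memory list: the weights are normalized into cumulative bounds on [0, 1), and each element goes to the split whose interval contains a uniform draw. The class `RandomSplitSketch` and its methods are hypothetical names for illustration only; this models the idea on a local collection and is not Spark's distributed implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Weighted random split of a local list. A sketch, not Spark's randomSplit.
final class RandomSplitSketch {

    /** Normalizes weights into cumulative bounds: bounds[0] = 0.0 and bounds[n] = 1.0. */
    static double[] cumulativeBounds(double[] weights) {
        double sum = 0.0;
        for (double w : weights) {
            if (w < 0.0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            sum += w;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        double[] bounds = new double[weights.length + 1];
        double running = 0.0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i];
            bounds[i + 1] = running / sum;
        }
        bounds[weights.length] = 1.0; // guard against floating-point rounding
        return bounds;
    }

    /** Assigns each item to split k when a uniform draw x satisfies bounds[k] <= x < bounds[k+1]. */
    static <T> List<List<T>> randomSplit(List<T> items, double[] weights, long seed) {
        double[] bounds = cumulativeBounds(weights);
        List<List<T>> splits = new ArrayList<>();
        for (int i = 0; i < weights.length; i++) {
            splits.add(new ArrayList<>());
        }
        Random rand = new Random(seed);
        for (T item : items) {
            double x = rand.nextDouble(); // uniform in [0.0, 1.0)
            int k = 0;
            while (x >= bounds[k + 1]) {
                k++;
            }
            splits.get(k).add(item);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            items.add(i);
        }
        System.out.println(randomSplit(items, new double[]{3.0, 1.0}, 11L));
    }
}
```

Because membership is decided per element, split sizes only approximate the requested proportions; fixing the seed makes a particular split reproducible.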
- randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Developer API
Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
- RandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
- range(long, long, long, int) - Method in class org.apache.spark.SparkContext
-
Creates a new RDD[Long] containing elements from start to end (exclusive), increased by step every element.
- range(long) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a Dataset with a single LongType column named id, containing elements in a range from 0 to end (exclusive) with step value 1.
- range(long, long) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a Dataset with a single LongType column named id, containing elements in a range from start to end (exclusive) with step value 1.
- range(long, long, long) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a Dataset with a single LongType column named id, containing elements in a range from start to end (exclusive) with a step value.
- range(long, long, long, int) - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Creates a Dataset with a single LongType column named id, containing elements in a range from start to end (exclusive) with a step value, with partition number specified.
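The half-open semantics shared by the range entries above (start inclusive, end exclusive, advancing by step) can be sketched on a local list in plain Java. The class `RangeSketch` is a hypothetical name for illustration; the handling of negative steps and the rejection of a zero step are assumptions of this sketch, and overflow near the limits of `long` is not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Half-open numeric range on a local list. A sketch, not Spark's range.
final class RangeSketch {

    /** Elements start, start + step, ... strictly before end (or strictly after it for a negative step). */
    static List<Long> range(long start, long end, long step) {
        if (step == 0L) {
            throw new IllegalArgumentException("step must be non-zero");
        }
        List<Long> out = new ArrayList<>();
        if (step > 0L) {
            for (long v = start; v < end; v += step) {
                out.add(v);
            }
        } else {
            for (long v = start; v > end; v += step) {
                out.add(v);
            }
        }
        return out;
    }

    /** Single-argument form: from 0 to end (exclusive) with step 1. */
    static List<Long> range(long end) {
        return range(0L, end, 1L);
    }

    public static void main(String[] args) {
        System.out.println(range(0L, 10L, 3L));
        System.out.println(range(3L));
    }
}
```

The exclusive end means that a range whose start equals its end is empty.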
- range(long) - Method in class org.apache.spark.sql.SQLContext
-
- range(long, long) - Method in class org.apache.spark.sql.SQLContext
-
- range(long, long, long) - Method in class org.apache.spark.sql.SQLContext
-
- range(long, long, long, int) - Method in class org.apache.spark.sql.SQLContext
-
- rangeBetween(long, long) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the frame boundaries defined, from start (inclusive) to end (inclusive).
- rangeBetween(Column, Column) - Static method in class org.apache.spark.sql.expressions.Window
-
- rangeBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
Defines the frame boundaries, from start (inclusive) to end (inclusive).
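A range-based frame selects rows by the value of the ordering expression rather than by row position: for a current row with ordering value k, the frame covers rows whose ordering value lies in [k + start, k + end], both ends inclusive. The following plain-Java sketch computes a sum over such a frame for local arrays. The class `RangeFrameSketch` and its method are hypothetical names for illustration, and the quadratic scan is chosen for clarity, not as a model of how Spark evaluates window frames.

```java
// Sum over a value-based (range) window frame. A sketch, not Spark code.
final class RangeFrameSketch {

    /**
     * For each row i, sums values[j] over all rows j whose key satisfies
     * keys[i] + start <= keys[j] <= keys[i] + end (both bounds inclusive).
     */
    static long[] sumOverRangeFrame(long[] keys, long[] values, long start, long end) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have equal length");
        }
        long[] out = new long[keys.length];
        for (int i = 0; i < keys.length; i++) {
            long lo = keys[i] + start;
            long hi = keys[i] + end;
            long sum = 0L;
            for (int j = 0; j < keys.length; j++) {
                if (keys[j] >= lo && keys[j] <= hi) {
                    sum += values[j];
                }
            }
            out[i] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] keys = {1L, 2L, 4L, 7L};
        long[] values = {10L, 20L, 30L, 40L};
        System.out.println(java.util.Arrays.toString(sumOverRangeFrame(keys, values, -1L, 2L)));
    }
}
```

With gaps in the ordering values, a range frame can contain fewer rows than a row-based frame with the same offsets, which is the main practical difference between the two.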
- rangeBetween(Column, Column) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- RangeDependency<T> - Class in org.apache.spark
-
Developer API
Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
- RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
-
- RangePartitioner<K,V> - Class in org.apache.spark
-
A Partitioner that partitions sortable records by range into roughly equal ranges.
- RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, int, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
-
- RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
-
- rank() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- rank() - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- rank() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for rank of the matrix factorization (positive).
- rank() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
The numeric rank of the fitted linear model.
- rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- rank() - Static method in class org.apache.spark.sql.functions
-
Window function: returns the rank of rows within a window partition.
- RankingMetrics<T> - Class in org.apache.spark.mllib.evaluation
-
Evaluator for ranking algorithms.
- RankingMetrics(RDD<Tuple2<Object, Object>>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.evaluation.RankingMetrics
-
- RateEstimator - Interface in org.apache.spark.streaming.scheduler.rate
-
A component that estimates the rate at which an InputDStream should ingest records, based on updates at every batch completion.
- Rating(ID, ID, float) - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating
-
- rating() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
-
- Rating - Class in org.apache.spark.mllib.recommendation
-
A more compact class to represent a rating than Tuple3[Int, Int, Double].
- Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
-
- rating() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- Rating$() - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating$
-
- RatingBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlock$
-
- ratingCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for the column name for ratings.
- ratioParam() - Static method in class org.apache.spark.ml.image.SamplePathFilter
-
- raw2ProbabilityInPlace(Vector) - Method in interface org.apache.spark.ml.ann.TopologyModel
-
Probability of the model.
- rawPredictionCol() - Method in interface org.apache.spark.ml.param.shared.HasRawPredictionCol
-
Param for raw prediction (a.k.a.
- rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- RawTextHelper - Class in org.apache.spark.streaming.util
-
- RawTextHelper() - Constructor for class org.apache.spark.streaming.util.RawTextHelper
-
- RawTextSender - Class in org.apache.spark.streaming.util
-
A helper program that sends blocks of Kryo-serialized text strings out on a socket at a
specified rate.
- RawTextSender() - Constructor for class org.apache.spark.streaming.util.RawTextSender
-
- RBackendAuthHandler - Class in org.apache.spark.api.r
-
Authentication handler for connections from the R process.
- RBackendAuthHandler(String) - Constructor for class org.apache.spark.api.r.RBackendAuthHandler
-
- rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- rdd() - Method in class org.apache.spark.api.java.JavaRDD
-
- rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- RDD() - Static method in class org.apache.spark.api.r.RRunnerModes
-
- rdd() - Method in class org.apache.spark.Dependency
-
- rdd() - Method in class org.apache.spark.NarrowDependency
-
- RDD<T> - Class in org.apache.spark.rdd
-
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
- RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-
- RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-
Construct an RDD with just a one-to-one dependency on one parent.
- rdd() - Method in class org.apache.spark.ShuffleDependency
-
- rdd() - Method in class org.apache.spark.sql.Dataset
-
Represents the content of the Dataset as an RDD of T.
- RDD() - Static method in class org.apache.spark.storage.BlockId
-
- RDDBarrier<T> - Class in org.apache.spark.rdd
-
Experimental
Wraps an RDD in a barrier stage, which forces Spark to launch tasks of this stage together.
- RDDBlockId - Class in org.apache.spark.storage
-
- RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
-
- rddBlocks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- rddBlocks() - Method in class org.apache.spark.status.LiveExecutor
-
- rddCleaned(int) - Method in interface org.apache.spark.CleanerListener
-
- RDDDataDistribution - Class in org.apache.spark.status.api.v1
-
- RDDFunctions<T> - Class in org.apache.spark.mllib.rdd
-
Developer API
Machine learning specific RDD functions.
- RDDFunctions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RDDFunctions
-
- rddId() - Method in class org.apache.spark.CleanCheckpoint
-
- rddId() - Method in class org.apache.spark.CleanRDD
-
- rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- rddId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
-
- rddId() - Method in class org.apache.spark.storage.RDDBlockId
-
- rddIds() - Method in class org.apache.spark.status.api.v1.StageData
-
- RDDInfo - Class in org.apache.spark.storage
-
- RDDInfo(int, String, int, StorageLevel, Seq<Object>, String, Option<org.apache.spark.rdd.RDDOperationScope>) - Constructor for class org.apache.spark.storage.RDDInfo
-
- rddInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
-
- rddInfoToJson(RDDInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
- RDDPartitionInfo - Class in org.apache.spark.status.api.v1
-
- RDDPartitionSeq - Class in org.apache.spark.status
-
A custom sequence of partitions based on a mutable linked list.
- RDDPartitionSeq() - Constructor for class org.apache.spark.status.RDDPartitionSeq
-
- rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- rdds() - Method in class org.apache.spark.rdd.UnionRDD
-
- RDDStorageInfo - Class in org.apache.spark.status.api.v1
-
- rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToDatasetHolder(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLImplicits
-
- rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, <any>, <any>) - Static method in class org.apache.spark.rdd.RDD
-
- read() - Method in class org.apache.spark.io.NioBufferedFileInputStream
-
- read(byte[], int, int) - Method in class org.apache.spark.io.NioBufferedFileInputStream
-
- read() - Method in class org.apache.spark.io.ReadAheadInputStream
-
- read(byte[], int, int) - Method in class org.apache.spark.io.ReadAheadInputStream
-
- read() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- read() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- read() - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- read() - Static method in class org.apache.spark.ml.classification.GBTClassifier
-
- read() - Static method in class org.apache.spark.ml.classification.LinearSVC
-
- read() - Static method in class org.apache.spark.ml.classification.LinearSVCModel
-
- read() - Static method in class org.apache.spark.ml.classification.LogisticRegression
-
- read() - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- read() - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
-
- read() - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
- read() - Static method in class org.apache.spark.ml.classification.NaiveBayes
-
- read() - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
-
- read() - Static method in class org.apache.spark.ml.classification.OneVsRest
-
- read() - Static method in class org.apache.spark.ml.classification.OneVsRestModel
-
- read() - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- read() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- read() - Static method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- read() - Static method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
- read() - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
- read() - Static method in class org.apache.spark.ml.clustering.GaussianMixture
-
- read() - Static method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- read() - Static method in class org.apache.spark.ml.clustering.KMeans
-
- read() - Static method in class org.apache.spark.ml.clustering.KMeansModel
-
- read() - Static method in class org.apache.spark.ml.clustering.LDA
-
- read() - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
-
- read() - Static method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- read() - Static method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- read() - Static method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- read() - Static method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- read() - Static method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- read() - Static method in class org.apache.spark.ml.feature.Binarizer
-
- read() - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- read() - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
-
- read() - Static method in class org.apache.spark.ml.feature.Bucketizer
-
- read() - Static method in class org.apache.spark.ml.feature.ChiSqSelector
-
- read() - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-
- read() - Static method in class org.apache.spark.ml.feature.ColumnPruner
-
- read() - Static method in class org.apache.spark.ml.feature.CountVectorizer
-
- read() - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- read() - Static method in class org.apache.spark.ml.feature.DCT
-
- read() - Static method in class org.apache.spark.ml.feature.ElementwiseProduct
-
- read() - Static method in class org.apache.spark.ml.feature.FeatureHasher
-
- read() - Static method in class org.apache.spark.ml.feature.HashingTF
-
- read() - Static method in class org.apache.spark.ml.feature.IDF
-
- read() - Static method in class org.apache.spark.ml.feature.IDFModel
-
- read() - Static method in class org.apache.spark.ml.feature.Imputer
-
- read() - Static method in class org.apache.spark.ml.feature.ImputerModel
-
- read() - Static method in class org.apache.spark.ml.feature.IndexToString
-
- read() - Static method in class org.apache.spark.ml.feature.Interaction
-
- read() - Static method in class org.apache.spark.ml.feature.MaxAbsScaler
-
- read() - Static method in class org.apache.spark.ml.feature.MaxAbsScalerModel
-
- read() - Static method in class org.apache.spark.ml.feature.MinHashLSH
-
- read() - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
-
- read() - Static method in class org.apache.spark.ml.feature.MinMaxScaler
-
- read() - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- read() - Static method in class org.apache.spark.ml.feature.NGram
-
- read() - Static method in class org.apache.spark.ml.feature.Normalizer
-
- read() - Static method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- read() - Static method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- read() - Static method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- read() - Static method in class org.apache.spark.ml.feature.PCA
-
- read() - Static method in class org.apache.spark.ml.feature.PCAModel
-
- read() - Static method in class org.apache.spark.ml.feature.PolynomialExpansion
-
- read() - Static method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- read() - Static method in class org.apache.spark.ml.feature.RegexTokenizer
-
- read() - Static method in class org.apache.spark.ml.feature.RFormula
-
- read() - Static method in class org.apache.spark.ml.feature.RFormulaModel
-
- read() - Static method in class org.apache.spark.ml.feature.SQLTransformer
-
- read() - Static method in class org.apache.spark.ml.feature.StandardScaler
-
- read() - Static method in class org.apache.spark.ml.feature.StandardScalerModel
-
- read() - Static method in class org.apache.spark.ml.feature.StopWordsRemover
-
- read() - Static method in class org.apache.spark.ml.feature.StringIndexer
-
- read() - Static method in class org.apache.spark.ml.feature.StringIndexerModel
-
- read() - Static method in class org.apache.spark.ml.feature.Tokenizer
-
- read() - Static method in class org.apache.spark.ml.feature.VectorAssembler
-
- read() - Static method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-
- read() - Static method in class org.apache.spark.ml.feature.VectorIndexer
-
- read() - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- read() - Static method in class org.apache.spark.ml.feature.VectorSizeHint
-
- read() - Static method in class org.apache.spark.ml.feature.VectorSlicer
-
- read() - Static method in class org.apache.spark.ml.feature.Word2Vec
-
- read() - Static method in class org.apache.spark.ml.feature.Word2VecModel
-
- read() - Static method in class org.apache.spark.ml.fpm.FPGrowth
-
- read() - Static method in class org.apache.spark.ml.fpm.FPGrowthModel
-
- read() - Static method in class org.apache.spark.ml.Pipeline
-
- read() - Static method in class org.apache.spark.ml.PipelineModel
-
- read() - Static method in class org.apache.spark.ml.recommendation.ALS
-
- read() - Static method in class org.apache.spark.ml.recommendation.ALSModel
-
- read() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- read() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- read() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- read() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- read() - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- read() - Static method in class org.apache.spark.ml.regression.GBTRegressor
-
- read() - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
- read() - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
- read() - Static method in class org.apache.spark.ml.regression.IsotonicRegression
-
- read() - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
- read() - Static method in class org.apache.spark.ml.regression.LinearRegression
-
- read() - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- read() - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- read() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- read() - Static method in class org.apache.spark.ml.tuning.CrossValidator
-
- read() - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- read() - Static method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- read() - Static method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-
- read() - Method in interface org.apache.spark.ml.util.DefaultParamsReadable
-
- read() - Method in interface org.apache.spark.ml.util.MLReadable
-
Returns an MLReader instance for this class.
- read(ByteBuffer) - Method in class org.apache.spark.security.CryptoStreamUtils.ErrorHandlingReadableChannel
-
- read(Kryo, Input, Class<Iterable<?>>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
-
- read() - Method in class org.apache.spark.sql.SparkSession
-
Returns a DataFrameReader that can be used to read non-streaming data in as a DataFrame.
- read() - Method in class org.apache.spark.sql.SQLContext
-
- read() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- read(byte[]) - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- read(byte[], int, int) - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- read(String) - Static method in class org.apache.spark.streaming.CheckpointReader
-
Read checkpoint files present in the given checkpoint directory.
- read(String, SparkConf, Configuration, boolean) - Static method in class org.apache.spark.streaming.CheckpointReader
-
Read checkpoint files present in the given checkpoint directory.
- read(WriteAheadLogRecordHandle) - Method in class org.apache.spark.streaming.util.WriteAheadLog
-
Read a written record based on the given record handle.
- ReadableChannelFileRegion - Class in org.apache.spark.storage
-
- ReadableChannelFileRegion(ReadableByteChannel, long) - Constructor for class org.apache.spark.storage.ReadableChannelFileRegion
-
- ReadAheadInputStream - Class in org.apache.spark.io
-
InputStream implementation which asynchronously reads ahead from the underlying input
stream when a specified amount of data has been read from the current buffer.
- ReadAheadInputStream(InputStream, int) - Constructor for class org.apache.spark.io.ReadAheadInputStream
-
Creates a ReadAheadInputStream with the specified buffer size and read-ahead threshold.
- readAll() - Method in class org.apache.spark.streaming.util.WriteAheadLog
-
Read and return an iterator of all the records that have been written but not yet cleaned up.
- readArray(DataInputStream, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
-
- readBoolean(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readBooleanArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readBytes(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readBytes() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- readBytesArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readDate(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readDouble(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readDoubleArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
-
- readFrom(ConfigReader) - Method in class org.apache.spark.internal.config.ConfigEntryWithDefault
-
- readFrom(ConfigReader) - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
-
- readFrom(ConfigReader) - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultString
-
- readFrom(InputStream) - Static method in class org.apache.spark.util.sketch.BloomFilter
-
- readFrom(InputStream) - Static method in class org.apache.spark.util.sketch.CountMinSketch
-
- readFrom(byte[]) - Static method in class org.apache.spark.util.sketch.CountMinSketch
-
- readImages(String) - Static method in class org.apache.spark.ml.image.ImageSchema
-
- readImages(String, SparkSession, boolean, int, boolean, double, long) - Static method in class org.apache.spark.ml.image.ImageSchema
-
- readInt(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readIntArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readKey(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
-
Reads the object representing the key of a key-value pair.
- readList(DataInputStream, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
-
- readMap(DataInputStream, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
-
- readObject(DataInputStream, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
-
- readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
-
The most general-purpose method to read an object.
- readObjectType(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readRecords() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- readSchema(Seq<String>, Option<Configuration>, boolean) - Static method in class org.apache.spark.sql.hive.orc.OrcFileOperator
-
- readSchema() - Method in interface org.apache.spark.sql.sources.v2.reader.DataSourceReader
-
Returns the actual schema of this data source reader, which may be different from the physical
schema of the underlying storage, as column pruning or other optimizations may happen.
- readSqlObject(DataInputStream, char) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- readStream() - Method in class org.apache.spark.sql.SparkSession
-
Returns a DataStreamReader that can be used to read streaming data in as a DataFrame.
- readStream() - Method in class org.apache.spark.sql.SQLContext
-
- readString(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readStringArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readStringBytes(DataInputStream, int) - Static method in class org.apache.spark.api.r.SerDe
-
- ReadSupport - Interface in org.apache.spark.sql.sources.v2
-
- readTime(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
-
- readTypedObject(DataInputStream, char, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
-
- readValue(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
-
Reads the object representing the value of a key-value pair.
- ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-
- ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
-
Blocks until this action completes.
- ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-
- reason() - Method in class org.apache.spark.ExecutorLostFailure
-
- reason() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
-
- reason() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
-
- reason() - Method in class org.apache.spark.scheduler.local.KillTask
-
- reason() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- reason() - Method in class org.apache.spark.TaskKilled
-
- reason() - Method in exception org.apache.spark.TaskKilledException
-
- Recall - Class in org.apache.spark.mllib.evaluation.binary
-
Recall.
- Recall() - Constructor for class org.apache.spark.mllib.evaluation.binary.Recall
-
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns recall for a given label (category)
- recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
- recall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based recall averaged by the number of documents
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns recall for a given label (category)
- recallByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Returns recall for each label (category).
- recallByThreshold() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-
Returns a dataframe with two fields (threshold, recall) representing the curve.
- recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, recall) curve.
- receive() - Method in interface org.apache.spark.rpc.RpcEndpoint
-
Process messages from RpcEndpointRef.send or RpcCallContext.reply.
- receiveAndReply(RpcCallContext) - Method in interface org.apache.spark.rpc.RpcEndpoint
-
Process messages from RpcEndpointRef.ask.
- ReceivedBlock - Interface in org.apache.spark.streaming.receiver
-
Trait representing a received block
- ReceivedBlockHandler - Interface in org.apache.spark.streaming.receiver
-
Trait that represents a class that handles the storage of blocks received by receiver
- ReceivedBlockStoreResult - Interface in org.apache.spark.streaming.receiver
-
Trait that represents the metadata related to storage of blocks
- ReceivedBlockTrackerLogEvent - Interface in org.apache.spark.streaming.scheduler
-
Trait representing any event in the ReceivedBlockTracker that updates its state.
- Receiver<T> - Class in org.apache.spark.streaming.receiver
-
Developer API
Abstract class of a receiver that can be run on worker nodes to receive external data.
- Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
-
- RECEIVER_WAL_CLASS_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- RECEIVER_WAL_CLOSE_AFTER_WRITE_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- RECEIVER_WAL_ENABLE_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- RECEIVER_WAL_MAX_FAILURES_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- RECEIVER_WAL_ROLLING_INTERVAL_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- ReceiverInfo - Class in org.apache.spark.status.api.v1.streaming
-
- ReceiverInfo - Class in org.apache.spark.streaming.scheduler
-
Developer API
Class having information about a receiver
- ReceiverInfo(int, String, boolean, String, String, String, String, long) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
- ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- ReceiverMessage - Interface in org.apache.spark.streaming.receiver
-
Messages sent to the Receiver.
- ReceiverState - Class in org.apache.spark.streaming.scheduler
-
Enumeration to identify current state of a Receiver
- ReceiverState() - Constructor for class org.apache.spark.streaming.scheduler.ReceiverState
-
- receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented receiver.
- receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented receiver.
- ReceiverTrackerLocalMessage - Interface in org.apache.spark.streaming.scheduler
-
Messages used by the driver and ReceiverTrackerEndpoint to communicate locally.
- ReceiverTrackerMessage - Interface in org.apache.spark.streaming.scheduler
-
Messages used by the NetworkReceiver and the ReceiverTracker to communicate
with each other.
- recentProgress() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
- recommendForAllItems(int) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
Returns top numUsers users recommended for each item, for all items.
- recommendForAllUsers(int) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
Returns top numItems items recommended for each user, for all users.
- recommendForItemSubset(Dataset<?>, int) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
Returns top numUsers users recommended for each item id in the input data set.
- recommendForUserSubset(Dataset<?>, int) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
Returns top numItems items recommended for each user id in the input data set.
- recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends products to a user.
- recommendProductsForUsers(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends top products for all users.
- recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends users to a product.
- recommendUsersForProducts(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends top users for all products.
- recordReader(InputStream, Configuration) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- recordReaderClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Update the input bytes read metric each time this number of records has been read
- RECORDS_READ() - Method in class org.apache.spark.InternalAccumulator.input$
-
- RECORDS_READ() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
-
- RECORDS_WRITTEN() - Method in class org.apache.spark.InternalAccumulator.output$
-
- RECORDS_WRITTEN() - Method in class org.apache.spark.InternalAccumulator.shuffleWrite$
-
- recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetricDistributions
-
- recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetrics
-
- recordsRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetricDistributions
-
- recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetrics
-
- recordsWritten() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
-
- recordWriter(OutputStream, Configuration) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- recordWriterClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- recoverPartitions(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Recovers all the partitions in the directory of a table and updates the catalog.
- RecursiveFlag - Class in org.apache.spark.ml.image
-
- RecursiveFlag() - Constructor for class org.apache.spark.ml.image.RecursiveFlag
-
- recursiveList(File) - Static method in class org.apache.spark.TestUtils
-
Lists files recursively.
- redact(SparkConf, Seq<Tuple2<String, String>>) - Static method in class org.apache.spark.util.Utils
-
Redact the sensitive values in the given map.
- redact(Option<Regex>, Seq<Tuple2<String, String>>) - Static method in class org.apache.spark.util.Utils
-
Redact the sensitive values in the given map.
- redact(Option<Regex>, String) - Static method in class org.apache.spark.util.Utils
-
Redact the sensitive information in the given string.
- redact(Map<String, String>) - Static method in class org.apache.spark.util.Utils
-
Looks up the redaction regex from within the key value pairs and uses it to redact the rest
of the key value pairs.
- REDIRECT_CONNECTOR_NAME() - Static method in class org.apache.spark.ui.JettyUtils
-
- redirectableStream() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
-
- redirectError() - Method in class org.apache.spark.launcher.SparkLauncher
-
Specifies that stderr in spark-submit should be redirected to stdout.
- redirectError(ProcessBuilder.Redirect) - Method in class org.apache.spark.launcher.SparkLauncher
-
Redirects error output to the specified Redirect.
- redirectError(File) - Method in class org.apache.spark.launcher.SparkLauncher
-
Redirects error output to the specified File.
- redirectOutput(ProcessBuilder.Redirect) - Method in class org.apache.spark.launcher.SparkLauncher
-
Redirects standard output to the specified Redirect.
- redirectOutput(File) - Method in class org.apache.spark.launcher.SparkLauncher
-
Redirects standard output to the specified File.
- redirectToLog(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Sets all output to be logged and redirected to a logger with the specified name.
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Reduces the elements of this RDD using the specified commutative and associative binary
operator.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
-
Reduces the elements of this RDD using the specified commutative and
associative binary operator.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Scala-specific)
Reduces the elements of this Dataset using the specified binary function.
- reduce(ReduceFunction<T>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
(Java-specific)
Reduces the elements of this Dataset using the specified binary function.
- reduce(BUF, IN) - Method in class org.apache.spark.sql.expressions.Aggregator
-
Combine two values to produce a new value.
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing each RDD
of this DStream.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing each RDD
of this DStream.
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative and commutative reduce function.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative and commutative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative and commutative reduce function.
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative and commutative reduce function.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative and commutative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative and commutative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Create a new DStream by applying reduceByKey over a sliding window on this DStream.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by reducing over a sliding window using incremental computation.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window on this DStream.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative and commutative reduce function, but return
the result immediately to the master as a Map.
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative and commutative reduce function, but return
the results immediately to the master as a Map.
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- ReduceFunction<T> - Interface in org.apache.spark.api.java.function
-
Base interface for function used in Dataset's reduce.
- reduceGroups(Function2<V, V, V>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
(Scala-specific)
Reduces the elements of each group of data using the specified binary function.
- reduceGroups(ReduceFunction<V>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
(Java-specific)
Reduces the elements of each group of data using the specified binary function.
- reduceId() - Method in class org.apache.spark.FetchFailed
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- references() - Method in class org.apache.spark.sql.sources.And
-
- references() - Method in class org.apache.spark.sql.sources.EqualNullSafe
-
- references() - Method in class org.apache.spark.sql.sources.EqualTo
-
- references() - Method in class org.apache.spark.sql.sources.Filter
-
List of columns that are referenced by this filter.
- references() - Method in class org.apache.spark.sql.sources.GreaterThan
-
- references() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- references() - Method in class org.apache.spark.sql.sources.In
-
- references() - Method in class org.apache.spark.sql.sources.IsNotNull
-
- references() - Method in class org.apache.spark.sql.sources.IsNull
-
- references() - Method in class org.apache.spark.sql.sources.LessThan
-
- references() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
-
- references() - Method in class org.apache.spark.sql.sources.Not
-
- references() - Method in class org.apache.spark.sql.sources.Or
-
- references() - Method in class org.apache.spark.sql.sources.StringContains
-
- references() - Method in class org.apache.spark.sql.sources.StringEndsWith
-
- references() - Method in class org.apache.spark.sql.sources.StringStartsWith
-
- refreshByPath(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Invalidates and refreshes all the cached data (and the associated metadata) for any Dataset
that contains the given data source path.
- refreshTable(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Invalidates and refreshes all the cached data and metadata of the given table.
- refreshTable(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
Deprecated.
Invalidate and refresh all the cached metadata of the given table.
- regex(Regex) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- regexFromString(String, String) - Static method in class org.apache.spark.internal.config.ConfigHelpers
-
- regexp_extract(Column, String, int) - Static method in class org.apache.spark.sql.functions
-
Extract a specific group matched by a Java regex, from the specified string column.
- regexp_replace(Column, String, String) - Static method in class org.apache.spark.sql.functions
-
Replace all substrings of the specified string value that match regexp with rep.
- regexp_replace(Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Replace all substrings of the specified string value that match regexp with rep.
- RegexTokenizer - Class in org.apache.spark.ml.feature
-
A regex based tokenizer that extracts tokens either by using the provided regex pattern to split
the text (default) or repeatedly matching the regex (if gaps is false).
- RegexTokenizer(String) - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
-
- RegexTokenizer() - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
-
- register(AccumulatorV2<?, ?>) - Method in class org.apache.spark.SparkContext
-
Register the given accumulator.
- register(AccumulatorV2<?, ?>, String) - Method in class org.apache.spark.SparkContext
-
Register the given accumulator with given name.
- register(String, String) - Static method in class org.apache.spark.sql.types.UDTRegistration
-
Registers a UserDefinedType to a user class.
- register(String, UserDefinedAggregateFunction) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a user-defined aggregate function (UDAF).
- register(String, UserDefinedFunction) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a user-defined function (UDF), for a UDF that's already defined using the Dataset
API (i.e.
- register(String, Function0<RT>, TypeTags.TypeTag<RT>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 0 arguments as user-defined function (UDF).
- register(String, Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 1 argument as user-defined function (UDF).
- register(String, Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 2 arguments as user-defined function (UDF).
- register(String, Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 3 arguments as user-defined function (UDF).
- register(String, Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 4 arguments as user-defined function (UDF).
- register(String, Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 5 arguments as user-defined function (UDF).
- register(String, Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 6 arguments as user-defined function (UDF).
- register(String, Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 7 arguments as user-defined function (UDF).
- register(String, Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 8 arguments as user-defined function (UDF).
- register(String, Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 9 arguments as user-defined function (UDF).
- register(String, Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 10 arguments as user-defined function (UDF).
- register(String, Function11<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 11 arguments as user-defined function (UDF).
- register(String, Function12<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 12 arguments as user-defined function (UDF).
- register(String, Function13<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 13 arguments as user-defined function (UDF).
- register(String, Function14<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 14 arguments as user-defined function (UDF).
- register(String, Function15<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 15 arguments as user-defined function (UDF).
- register(String, Function16<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 16 arguments as user-defined function (UDF).
- register(String, Function17<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 17 arguments as user-defined function (UDF).
- register(String, Function18<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 18 arguments as user-defined function (UDF).
- register(String, Function19<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 19 arguments as user-defined function (UDF).
- register(String, Function20<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 20 arguments as user-defined function (UDF).
- register(String, Function21<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 21 arguments as user-defined function (UDF).
- register(String, Function22<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>, TypeTags.TypeTag<A22>) - Method in class org.apache.spark.sql.UDFRegistration
-
Registers a deterministic Scala closure of 22 arguments as user-defined function (UDF).
- register(String, UDF0<?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF0 instance as user-defined function (UDF).
- register(String, UDF1<?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF1 instance as user-defined function (UDF).
- register(String, UDF2<?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF2 instance as user-defined function (UDF).
- register(String, UDF3<?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF3 instance as user-defined function (UDF).
- register(String, UDF4<?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF4 instance as user-defined function (UDF).
- register(String, UDF5<?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF5 instance as user-defined function (UDF).
- register(String, UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF6 instance as user-defined function (UDF).
- register(String, UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF7 instance as user-defined function (UDF).
- register(String, UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF8 instance as user-defined function (UDF).
- register(String, UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF9 instance as user-defined function (UDF).
- register(String, UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF10 instance as user-defined function (UDF).
- register(String, UDF11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF11 instance as user-defined function (UDF).
- register(String, UDF12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF12 instance as user-defined function (UDF).
- register(String, UDF13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF13 instance as user-defined function (UDF).
- register(String, UDF14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF14 instance as user-defined function (UDF).
- register(String, UDF15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF15 instance as user-defined function (UDF).
- register(String, UDF16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF16 instance as user-defined function (UDF).
- register(String, UDF17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF17 instance as user-defined function (UDF).
- register(String, UDF18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF18 instance as user-defined function (UDF).
- register(String, UDF19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF19 instance as user-defined function (UDF).
- register(String, UDF20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF20 instance as user-defined function (UDF).
- register(String, UDF21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF21 instance as user-defined function (UDF).
- register(String, UDF22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a deterministic Java UDF22 instance as user-defined function (UDF).
- register(QueryExecutionListener) - Method in class org.apache.spark.sql.util.ExecutionListenerManager
-
- register(AccumulatorV2<?, ?>) - Static method in class org.apache.spark.util.AccumulatorContext
-
Registers an AccumulatorV2 created on the driver such that it can be used on the executors.
- register(String, Function0<Object>) - Static method in class org.apache.spark.util.SignalUtils
-
Adds an action to be run when a given signal is received by this process.
- registerAvroSchemas(Seq<Schema>) - Method in class org.apache.spark.SparkConf
-
Use Kryo serialization and register the given set of Avro schemas so that the generic
record serializer can decrease network IO
- RegisterBlockManager(BlockManagerId, long, long, org.apache.spark.rpc.RpcEndpointRef) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
-
- RegisterBlockManager$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
-
- registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
-
- RegisterClusterManager(org.apache.spark.rpc.RpcEndpointRef) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager
-
- RegisterClusterManager$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
-
- registerDialect(JdbcDialect) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects
-
Register a dialect for use on all new matching jdbc org.apache.spark.sql.DataFrame.
- RegisteredExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
-
- RegisterExecutor(String, org.apache.spark.rpc.RpcEndpointRef, String, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- RegisterExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
-
- RegisterExecutorFailed(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
-
- RegisterExecutorFailed$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
-
- registerKryoClasses(SparkConf) - Static method in class org.apache.spark.graphx.GraphXUtils
-
Registers classes that GraphX uses with Kryo.
- registerKryoClasses(SparkContext) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
-
- registerKryoClasses(Class<?>[]) - Method in class org.apache.spark.SparkConf
-
Use Kryo serialization and register the given set of classes with Kryo.
- registerLogger(Logger) - Static method in class org.apache.spark.util.SignalUtils
-
Register a signal handler to log signals on UNIX-like systems.
- registerShutdownDeleteDir(File) - Static method in class org.apache.spark.util.ShutdownHookManager
-
- registerStream(DStream<BinarySample>) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-
Register a DStream of values for significance testing.
- registerStream(JavaDStream<BinarySample>) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-
Register a JavaDStream of values for significance testing.
- registerTempTable(String) - Method in class org.apache.spark.sql.Dataset
-
- regParam() - Method in interface org.apache.spark.ml.optim.loss.DifferentiableRegularization
-
Magnitude of the regularization penalty.
- regParam() - Method in interface org.apache.spark.ml.param.shared.HasRegParam
-
Param for regularization parameter (>= 0).
- Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- RegressionEvaluator - Class in org.apache.spark.ml.evaluation
-
Experimental
Evaluator for regression, which expects two input columns: prediction and label.
- RegressionEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- RegressionEvaluator() - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- RegressionMetrics - Class in org.apache.spark.mllib.evaluation
-
Evaluator for regression.
- RegressionMetrics(RDD<Tuple2<Object, Object>>, boolean) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
-
- RegressionMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
-
- RegressionModel<FeaturesType,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression
-
Developer API
- RegressionModel() - Constructor for class org.apache.spark.ml.regression.RegressionModel
-
- RegressionModel - Interface in org.apache.spark.mllib.regression
-
- reindex() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- reindex() - Method in class org.apache.spark.graphx.VertexRDD
-
Construct a new VertexRDD that is indexed by only the visible vertices.
- RelationalGroupedDataset - Class in org.apache.spark.sql
-
A set of methods for aggregations on a DataFrame, created by groupBy, cube or rollup (and also pivot).
- RelationalGroupedDataset.CubeType$ - Class in org.apache.spark.sql
-
To indicate it's the CUBE
- RelationalGroupedDataset.GroupByType$ - Class in org.apache.spark.sql
-
To indicate it's the GroupBy
- RelationalGroupedDataset.GroupType - Interface in org.apache.spark.sql
-
The Grouping Type
- RelationalGroupedDataset.PivotType$ - Class in org.apache.spark.sql
-
- RelationalGroupedDataset.RollupType$ - Class in org.apache.spark.sql
-
To indicate it's the ROLLUP
- RelationConversions - Class in org.apache.spark.sql.hive
-
Relation conversion from metastore relations to data source relations for better performance
- RelationConversions(SQLConf, HiveSessionCatalog) - Constructor for class org.apache.spark.sql.hive.RelationConversions
-
- RelationProvider - Interface in org.apache.spark.sql.sources
-
Implemented by objects that produce relations for a specific kind of data source.
- relativeDirection(long) - Method in class org.apache.spark.graphx.Edge
-
Return the relative direction of the edge to the corresponding vertex.
- relativeError() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase
-
Relative error (see documentation for org.apache.spark.sql.DataFrameStatFunctions.approxQuantile for description).
Must be in the range [0, 1].
- relativeError() - Method in class org.apache.spark.util.sketch.CountMinSketch
-
- rem(Decimal, Decimal) - Method in class org.apache.spark.sql.types.Decimal.DecimalAsIfIntegral$
-
- remainder(Decimal) - Method in class org.apache.spark.sql.types.Decimal
-
- remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets each DStream in this context to remember RDDs it generated in the last given duration.
- remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext
-
Set each DStream in this context to remember RDDs it generated in the last given duration.
- REMOTE_BLOCKS_FETCHED() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
-
- REMOTE_BYTES_READ() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
-
- REMOTE_BYTES_READ_TO_DISK() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
-
- remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- remoteBytesReadToDisk() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- remoteBytesReadToDisk() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- remove(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Removes a key from this map and returns its previously associated value as an option.
- remove(String) - Method in class org.apache.spark.SparkConf
-
Remove a parameter from the configuration
- remove() - Method in interface org.apache.spark.sql.streaming.GroupState
-
Remove this state.
- remove(String) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
- remove() - Method in class org.apache.spark.streaming.State
-
Remove the state if it exists.
- remove(long) - Static method in class org.apache.spark.util.AccumulatorContext
-
- RemoveBlock(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
-
- RemoveBlock$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
-
- RemoveBroadcast(long, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
-
- RemoveBroadcast$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
-
- removeDistribution(LiveExecutor) - Method in class org.apache.spark.status.LiveRDD
-
- RemoveExecutor(String, org.apache.spark.scheduler.ExecutorLossReason) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
-
- RemoveExecutor(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor
-
- RemoveExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
-
- RemoveExecutor$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
-
- removeFromDriver() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
-
- removeListener(StreamingQueryListener) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
-
- removeListener(L) - Method in interface org.apache.spark.util.ListenerBus
-
Remove a listener so that it no longer receives any events.
- removeListenerOnError(SparkListenerInterface) - Method in class org.apache.spark.scheduler.AsyncEventQueue
-
- removeListenerOnError(L) - Method in interface org.apache.spark.util.ListenerBus
-
This can be overridden by subclasses if there is any extra cleanup to do when removing a
listener.
- removeMapOutput(int, BlockManagerId) - Method in class org.apache.spark.ShuffleStatus
-
Remove the map output which was served by the specified block manager.
- removeOutputsByFilter(Function1<BlockManagerId, Object>) - Method in class org.apache.spark.ShuffleStatus
-
Removes all shuffle outputs which satisfy the filter.
- removeOutputsOnExecutor(String) - Method in class org.apache.spark.ShuffleStatus
-
Removes all map outputs associated with the specified executor.
- removeOutputsOnHost(String) - Method in class org.apache.spark.ShuffleStatus
-
Removes all shuffle outputs associated with this host.
- removePartition(String) - Method in class org.apache.spark.status.LiveRDD
-
- removePartition(LiveRDDPartition) - Method in class org.apache.spark.status.RDDPartitionSeq
-
- RemoveRdd(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
-
- RemoveRdd$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
-
- removeReason() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- removeReason() - Method in class org.apache.spark.status.LiveExecutor
-
- removeSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
-
- removeSelfEdges() - Method in class org.apache.spark.graphx.GraphOps
-
Remove self edges.
- RemoveShuffle(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
-
- RemoveShuffle$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
-
- removeShutdownDeleteDir(File) - Static method in class org.apache.spark.util.ShutdownHookManager
-
- removeShutdownHook(Object) - Static method in class org.apache.spark.util.ShutdownHookManager
-
Remove a previously installed shutdown hook.
- removeSparkListener(SparkListenerInterface) - Method in class org.apache.spark.SparkContext
-
Developer API
Deregister the listener from Spark's listener bus.
- removeStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
-
- removeTime() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- removeTime() - Method in class org.apache.spark.status.LiveExecutor
-
- RemoveWorker(String, String, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker
-
- RemoveWorker$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker$
-
- renameFunction(String, String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Rename an existing function in the database.
- renamePartitions(String, String, Seq<Map<String, String>>, Seq<Map<String, String>>) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Rename one or many existing table partitions, assuming they exist.
- rep(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- rep1(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- rep1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- rep1sep(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int, Column...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.
- repartition(Column...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset partitioned by the given partitioning expressions, using
spark.sql.shuffle.partitions as number of partitions.
- repartition(int) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset that has exactly numPartitions partitions.
- repartition(int, Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.
- repartition(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset partitioned by the given partitioning expressions, using
spark.sql.shuffle.partitions as number of partitions.
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
- repartitionAndSortWithinPartitions(Partitioner, Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
- repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
- repartitionByRange(int, Column...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.
- repartitionByRange(Column...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.
- repartitionByRange(int, Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.
- repartitionByRange(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.
- repeat(Column, int) - Static method in class org.apache.spark.sql.functions
-
Repeats a string column n times, and returns it as a new string column.
- replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Replaces values matching keys in replacement map with the corresponding values.
- replace(String[], Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Replaces values matching keys in replacement map with the corresponding values.
- replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Replaces values matching keys in replacement map.
- replace(Seq<String>, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Replaces values matching keys in replacement map.
- replaceCharType(DataType) - Static method in class org.apache.spark.sql.types.HiveStringType
-
- replicas() - Method in class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock
-
- ReplicateBlock(BlockId, Seq<BlockManagerId>, int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock
-
- ReplicateBlock$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock$
-
- replicatedVertexView() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- replication() - Method in class org.apache.spark.storage.StorageLevel
-
- reply(Object) - Method in interface org.apache.spark.rpc.RpcCallContext
-
Reply a message to the sender.
- repN(int, Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- report() - Method in interface org.apache.spark.metrics.sink.Sink
-
- reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Report exceptions in receiving data.
- repsep(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- requestedTotal() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
-
- requestExecutors(int) - Method in interface org.apache.spark.ExecutorAllocationClient
-
Request an additional number of executors from the cluster manager.
- RequestExecutors(int, int, Map<String, Object>, Set<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
-
- requestExecutors(int) - Method in class org.apache.spark.SparkContext
-
Developer API
Request an additional number of executors from the cluster manager.
- RequestExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
-
- requestTotalExecutors(int, int, Map<String, Object>) - Method in interface org.apache.spark.ExecutorAllocationClient
-
Update the cluster manager on our scheduling needs.
- requestTotalExecutors(int, int, Map<String, Object>) - Method in class org.apache.spark.SparkContext
-
Update the cluster manager on our scheduling needs.
- res() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- reservoirSampleAndCount(Iterator<T>, int, long, ClassTag<T>) - Static method in class org.apache.spark.util.random.SamplingUtils
-
Reservoir sampling implementation that also returns the input size.
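To illustrate the idea behind the entry above, here is a minimal plain-Java sketch of reservoir sampling that also reports how many items were consumed. It does not use Spark and is not Spark's implementation; the class `ReservoirSketch`, the method `sampleAndCount`, and the `Result` holder are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not Spark's SamplingUtils code.
class ReservoirSketch {
    // Holds the sample together with the number of items consumed.
    static final class Result<T> {
        final List<T> sample;
        final long count;

        Result(List<T> sample, long count) {
            this.sample = sample;
            this.count = count;
        }
    }

    // Keeps at most k items, giving every input item the same chance of being
    // retained, and counts every item pulled from the iterator.
    static <T> Result<T> sampleAndCount(Iterator<T> input, int k, long seed) {
        List<T> reservoir = new ArrayList<>(Math.max(k, 0));
        Random rng = new Random(seed);
        long seen = 0;
        while (input.hasNext()) {
            T item = input.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k items are always kept.
                reservoir.add(item);
            } else {
                // Pick a slot in [0, seen); replace only if it falls inside the reservoir.
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return new Result<>(reservoir, seen);
    }

    public static void main(String[] args) {
        Result<Integer> r =
            sampleAndCount(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8).iterator(), 3, 42L);
        System.out.println("sample=" + r.sample + " count=" + r.count);
    }
}
```

When the input holds fewer than `k` items, the sample is simply the whole input, and the count tells the caller so.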
- reset() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
Resets the values of all metrics to zero.
- reset() - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Used for testing only.
- reset() - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- reset() - Method in class org.apache.spark.util.AccumulatorV2
-
Resets this accumulator to its zero value.
- reset() - Method in class org.apache.spark.util.CollectionAccumulator
-
- reset() - Method in class org.apache.spark.util.DoubleAccumulator
-
- reset() - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
-
- reset() - Method in class org.apache.spark.util.LongAccumulator
-
- resetTerminated() - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
-
Forget about past terminated queries so that awaitAnyTermination() can be used again to wait for new terminations.
- residualDegreeOfFreedom() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
The residual degrees of freedom.
- residualDegreeOfFreedomNull() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
The residual degrees of freedom for the null model.
- residuals() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
Get the default residuals (deviance residuals) of the fitted model.
- residuals(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
-
Get the residuals of the fitted model by type.
- residuals() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Residuals (label - predicted value)
- ResolveHiveSerdeTable - Class in org.apache.spark.sql.hive
-
Determine the database, serde/format and schema of the Hive serde table, according to the storage
properties.
- ResolveHiveSerdeTable(SparkSession) - Constructor for class org.apache.spark.sql.hive.ResolveHiveSerdeTable
-
- resolveURI(String) - Static method in class org.apache.spark.util.Utils
-
Return a well-formed URI for the file described by a user input string.
- resolveURIs(String) - Static method in class org.apache.spark.util.Utils
-
Resolve a comma-separated list of paths.
- responder() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
-
- responseFromBackup(String) - Static method in class org.apache.spark.util.Utils
-
Return true if the response message is sent from a backup Master on standby.
- restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- ResubmitFailedStages - Class in org.apache.spark.scheduler
-
- ResubmitFailedStages() - Constructor for class org.apache.spark.scheduler.ResubmitFailedStages
-
- Resubmitted - Class in org.apache.spark
-
Developer API
An org.apache.spark.scheduler.ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
- Resubmitted() - Constructor for class org.apache.spark.Resubmitted
-
- result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-
- result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
-
Awaits and returns the result (of type T) of this action.
- result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-
- RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.InternalAccumulator
-
- RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- RESULT_SIZE() - Static method in class org.apache.spark.InternalAccumulator
-
- RESULT_SIZE() - Static method in class org.apache.spark.status.TaskIndexNames
-
- resultFetchStart() - Method in class org.apache.spark.status.api.v1.TaskData
-
- resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
-
- resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- RetrieveLastAllocatedExecutorId$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveLastAllocatedExecutorId$
-
- RetrieveSparkAppConfig$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkAppConfig$
-
- retryWaitMs(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-
Returns the configured number of milliseconds to wait on each retry
- ReturnStatementFinder - Class in org.apache.spark.util
-
- ReturnStatementFinder(Option<String>) - Constructor for class org.apache.spark.util.ReturnStatementFinder
-
- reverse() - Method in class org.apache.spark.graphx.EdgeDirection
-
Reverse the direction of an edge.
- reverse() - Method in class org.apache.spark.graphx.EdgeRDD
-
Reverse all the edges in this RDD.
- reverse() - Method in class org.apache.spark.graphx.Graph
-
Reverses all edges in the graph.
- reverse() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- reverse() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- reverse(Column) - Static method in class org.apache.spark.sql.functions
-
Returns a reversed string or an array with reverse order of elements.
- reverseRoutingTables() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- reverseRoutingTables() - Method in class org.apache.spark.graphx.VertexRDD
-
Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding EdgeRDD.
- ReviveOffers - Class in org.apache.spark.scheduler.local
-
- ReviveOffers() - Constructor for class org.apache.spark.scheduler.local.ReviveOffers
-
- reviveOffers() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- ReviveOffers$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
-
- RFormula - Class in org.apache.spark.ml.feature
-
Experimental
Implements the transforms required for fitting a dataset against an R model formula.
- RFormula(String) - Constructor for class org.apache.spark.ml.feature.RFormula
-
- RFormula() - Constructor for class org.apache.spark.ml.feature.RFormula
-
- RFormulaBase - Interface in org.apache.spark.ml.feature
-
- RFormulaModel - Class in org.apache.spark.ml.feature
-
- RFormulaParser - Class in org.apache.spark.ml.feature
-
Limited implementation of R formula parsing.
- RFormulaParser() - Constructor for class org.apache.spark.ml.feature.RFormulaParser
-
- RidgeRegressionModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using RidgeRegression.
- RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train a regression model with L2-regularization using Stochastic Gradient Descent.
- RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
- right() - Method in class org.apache.spark.sql.sources.And
-
- right() - Method in class org.apache.spark.sql.sources.Or
-
- rightCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit
-
Get sorted categories which split to the right
- rightChild() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
-
- rightChild() - Method in class org.apache.spark.ml.tree.InternalNode
-
- rightChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the right child of this node.
- rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
-
- rightNodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
- rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- rint(Column) - Static method in class org.apache.spark.sql.functions
-
Returns the double value that is closest in value to the argument and
is equal to a mathematical integer.
- rint(String) - Static method in class org.apache.spark.sql.functions
-
Returns the double value that is closest in value to the argument and
is equal to a mathematical integer.
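The wording of the two rint entries matches that of `java.lang.Math.rint`. Assuming the same semantics apply (an assumption, not something the index states), the plain-Java sketch below shows how ties are resolved: a value exactly halfway between two integers goes to the even one. The class name `RintSketch` is hypothetical and no Spark code is involved.

```java
// Plain-Java illustration; assumes the semantics of java.lang.Math.rint.
class RintSketch {
    // Returns the double closest to x that is a mathematical integer;
    // exact ties resolve to the even neighbour.
    static double nearestInteger(double x) {
        return Math.rint(x);
    }

    public static void main(String[] args) {
        System.out.println(nearestInteger(2.4));   // prints 2.0
        System.out.println(nearestInteger(2.5));   // prints 2.0 (tie goes to even)
        System.out.println(nearestInteger(3.5));   // prints 4.0 (tie goes to even)
        System.out.println(nearestInteger(-2.5));  // prints -2.0
    }
}
```

This differs from HALF_UP rounding, under which 2.5 would become 3.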
- rlike(String) - Method in class org.apache.spark.sql.Column
-
SQL RLIKE expression (LIKE with Regex).
- RMATa() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATb() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATc() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATd() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- rmatGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
A random graph generator using the R-MAT model, proposed in
"R-MAT: A Recursive Model for Graph Mining" by Chakrabarti et al.
- rnd() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- roc() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-
Returns the receiver operating characteristic (ROC) curve,
which is a DataFrame having two fields (FPR, TPR)
with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
- roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the receiver operating characteristic (ROC) curve,
which is an RDD of (false positive rate, true positive rate)
with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
- rolledOver() - Method in interface org.apache.spark.util.logging.RollingPolicy
-
Notify that rollover has occurred
- RollingPolicy - Interface in org.apache.spark.util.logging
-
Defines the policy based on which RollingFileAppender will generate rolling files.
- rollup(Column...) - Method in class org.apache.spark.sql.Dataset
-
Create a multi-dimensional rollup for the current Dataset using the specified columns,
so we can run aggregation on them.
- rollup(String, String...) - Method in class org.apache.spark.sql.Dataset
-
Create a multi-dimensional rollup for the current Dataset using the specified columns,
so we can run aggregation on them.
- rollup(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Create a multi-dimensional rollup for the current Dataset using the specified columns,
so we can run aggregation on them.
- rollup(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Create a multi-dimensional rollup for the current Dataset using the specified columns,
so we can run aggregation on them.
- RollupType$() - Constructor for class org.apache.spark.sql.RelationalGroupedDataset.RollupType$
-
- rootMeanSquaredError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
Returns the root mean squared error, which is defined as the square root of
the mean squared error.
- rootMeanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the root mean squared error, which is defined as the square root of
the mean squared error.
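The definition in the two entries above (square root of the mean squared error) can be sketched in plain Java as follows. This is an illustration of the formula only, not the code behind either summary class; `RmseSketch` and its method name are hypothetical, and the residual is taken as label minus prediction.

```java
// Hypothetical illustration of the RMSE formula; not Spark code.
class RmseSketch {
    // sqrt( (1/n) * sum_i (label_i - prediction_i)^2 )
    static double rootMeanSquaredError(double[] labels, double[] predictions) {
        if (labels.length != predictions.length || labels.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sumSquared = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double residual = labels[i] - predictions[i];
            sumSquared += residual * residual;
        }
        return Math.sqrt(sumSquared / labels.length);
    }

    public static void main(String[] args) {
        double[] labels = {3.0, 5.0, 7.0};
        double[] predictions = {2.0, 5.0, 9.0};
        System.out.println(rootMeanSquaredError(labels, predictions));
    }
}
```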
- rootNode() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- rootNode() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- rootNode() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel
-
Root of the decision tree
- rootPool() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
-
- rootPool() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- round(Column) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the column e rounded to 0 decimal places with HALF_UP round mode.
- round(Column, int) - Static method in class org.apache.spark.sql.functions
-
Round the value of e to scale decimal places with HALF_UP round mode if scale is greater than or equal to 0 or at integral part when scale is less than 0.
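The effect of a non-negative versus a negative scale under HALF_UP rounding can be sketched with `java.math.BigDecimal`. This is a plain-Java illustration of the rounding rule described above, not the implementation behind `functions.round`; the class `HalfUpSketch` and the method `roundHalfUp` are hypothetical names.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration of HALF_UP rounding at a given scale; not Spark code.
class HalfUpSketch {
    // scale >= 0 rounds to that many decimal places;
    // scale < 0 rounds within the integral part (e.g. -2 rounds to hundreds).
    static BigDecimal roundHalfUp(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(roundHalfUp("2.5", 0).toPlainString());       // prints 3
        System.out.println(roundHalfUp("2.345", 2).toPlainString());     // prints 2.35
        System.out.println(roundHalfUp("1234.567", -2).toPlainString()); // prints 1200
        System.out.println(roundHalfUp("1250", -2).toPlainString());     // prints 1300
    }
}
```

The values are passed as strings so that the example is not affected by binary floating-point representation.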
- ROUND_CEILING() - Static method in class org.apache.spark.sql.types.Decimal
-
- ROUND_FLOOR() - Static method in class org.apache.spark.sql.types.Decimal
-
- ROUND_HALF_EVEN() - Static method in class org.apache.spark.sql.types.Decimal
-
- ROUND_HALF_UP() - Static method in class org.apache.spark.sql.types.Decimal
-
- ROW() - Static method in class org.apache.spark.api.r.SerializationFormats
-
- Row - Interface in org.apache.spark.sql
-
Represents one row of output from a relational operator.
- row(T) - Method in interface org.apache.spark.ui.PagedTable
-
- row_number() - Static method in class org.apache.spark.sql.functions
-
Window function: returns a sequential number starting at 1 within a window partition.
- RowFactory - Class in org.apache.spark.sql
-
A factory class used to construct Row objects.
- RowFactory() - Constructor for class org.apache.spark.sql.RowFactory
-
- rowIndices() - Method in class org.apache.spark.ml.linalg.SparseMatrix
-
- rowIndices() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- rowIter() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Returns an iterator of row vectors.
- rowIter() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Returns an iterator of row vectors.
- rowIterator() - Method in class org.apache.spark.sql.vectorized.ColumnarBatch
-
Returns an iterator over the rows in this batch.
- RowMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
Represents a row-oriented distributed Matrix with no meaningful row indices.
- RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- rowsBetween(long, long) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the frame boundaries defined, from start (inclusive) to end (inclusive).
- rowsBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
Defines the frame boundaries, from start (inclusive) to end (inclusive).
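To make the inclusive frame boundaries concrete, the plain-Java sketch below sums each row's frame over an array. It assumes that start and end are row offsets relative to the current row and that frames are clipped at the edges of the data; the index entry itself does not state this, so treat both as assumptions. `FrameSketch` and `frameSums` are hypothetical names and no Spark API is used.

```java
import java.util.Arrays;

// Hypothetical illustration of an inclusive row frame; not Spark code.
class FrameSketch {
    // For each row i, sums values[i + start .. i + end], both ends inclusive,
    // clipping the frame to the valid index range.
    static long[] frameSums(long[] values, int start, int end) {
        long[] sums = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            int from = Math.max(0, i + start);
            int to = Math.min(values.length - 1, i + end);
            long sum = 0;
            for (int j = from; j <= to; j++) {
                sum += values[j];
            }
            sums[i] = sum;
        }
        return sums;
    }

    public static void main(String[] args) {
        long[] values = {1, 2, 3, 4};
        // Previous row through current row.
        System.out.println(Arrays.toString(frameSums(values, -1, 0)));  // prints [1, 3, 5, 7]
        // Previous row through next row.
        System.out.println(Arrays.toString(frameSums(values, -1, 1)));  // prints [3, 6, 9, 7]
    }
}
```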
- rowsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- rPackages() - Static method in class org.apache.spark.api.r.RUtils
-
- rpad(Column, int, String) - Static method in class org.apache.spark.sql.functions
-
Right-pad the string column with pad to a length of len.
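A plain-Java sketch of right-padding to a fixed length follows. The handling of inputs already longer than the target length (truncation) and of an empty pad string (input returned unchanged) are assumptions made for this example rather than behaviour stated in the index; `RpadSketch` is a hypothetical class and no Spark code is used.

```java
// Hypothetical illustration of right-padding; not Spark code.
class RpadSketch {
    // Appends characters of pad, cycling through it, until the result has length len.
    // Assumption: strings longer than len are shortened to len characters.
    static String rpad(String str, int len, String pad) {
        if (len <= 0) {
            return "";
        }
        if (str.length() >= len) {
            return str.substring(0, len);
        }
        if (pad.isEmpty()) {
            return str;  // nothing to pad with
        }
        StringBuilder sb = new StringBuilder(str);
        int i = 0;
        while (sb.length() < len) {
            sb.append(pad.charAt(i % pad.length()));
            i++;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(rpad("hi", 5, "ab"));    // prints hiaba
        System.out.println(rpad("hello", 3, "x"));  // prints hel
    }
}
```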
- RpcCallContext - Interface in org.apache.spark.rpc
-
A callback that RpcEndpoint can use to send back a message or failure.
- RpcEndpoint - Interface in org.apache.spark.rpc
-
An end point for the RPC that defines what functions to trigger given a message.
- rpcEnv() - Method in interface org.apache.spark.rpc.RpcEndpoint
-
- RpcEnvFactory - Interface in org.apache.spark.rpc
-
A factory class to create the RpcEnv.
- RpcEnvFileServer - Interface in org.apache.spark.rpc
-
A server used by the RpcEnv to serve files to other processes owned by the application.
- RpcUtils - Class in org.apache.spark.util
-
- RpcUtils() - Constructor for class org.apache.spark.util.RpcUtils
-
- RRDD<T> - Class in org.apache.spark.api.r
-
An RDD that stores serialized R objects as Array[Byte].
- RRDD(RDD<T>, byte[], String, String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.RRDD
-
- RRunnerModes - Class in org.apache.spark.api.r
-
- RRunnerModes() - Constructor for class org.apache.spark.api.r.RRunnerModes
-
- rtrim(Column) - Static method in class org.apache.spark.sql.functions
-
Trim the spaces from right end for the specified string value.
- rtrim(Column, String) - Static method in class org.apache.spark.sql.functions
-
Trim the specified character string from right end for the specified string column.
- ruleName() - Static method in class org.apache.spark.sql.hive.HiveAnalysis
-
- run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
- run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
- run(Graph<VD, ED>, int, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.LabelPropagation
-
Run static Label Propagation for detecting communities in networks.
- run(Graph<VD, ED>, int, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run PageRank for a fixed number of iterations returning a graph
with vertex attributes containing the PageRank and edge
attributes the normalized edge weight.
- run(Graph<VD, ED>, Seq<Object>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ShortestPaths
-
Computes shortest paths to the given set of landmark vertices.
- run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.StronglyConnectedComponents
-
Compute the strongly connected component (SCC) of each vertex and return a graph with the
vertex value containing the lowest vertex id in the SCC containing that vertex.
- run(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
-
Implement SVD++ based on "Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model".
- run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
-
- run(RDD<LabeledPoint>, BoostingStrategy, long, String) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees
-
Method to train a gradient boosting model
- run(RDD<LabeledPoint>, Strategy, int, String, long, Option<org.apache.spark.ml.util.Instrumentation>, boolean, Option<String>) - Static method in class org.apache.spark.ml.tree.impl.RandomForest
-
Train a random forest.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-
Run Logistic Regression with the configured parameters on an input RDD
of LabeledPoint entries.
- run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-
Run Logistic Regression with the configured parameters on an input RDD
of LabeledPoint entries starting from the initial weights provided.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Runs the bisecting k-means algorithm.
- run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Java-friendly version of run().
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Perform expectation maximization
- run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Java-friendly version of run()
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
- run(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LDA
-
Learn an LDA model using the given dataset.
- run(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LDA
-
Java-friendly version of run()
- run(Graph<Object, Object>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Run the PIC algorithm on Graph.
- run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Run the PIC algorithm.
- run(JavaRDD<Tuple3<Long, Long, Double>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
A Java-friendly version of PowerIterationClustering.run.
- run(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.AssociationRules
-
Computes the association rules with confidence above minConfidence.
- run(RDD<FPGrowth.FreqItemset<Item>>, Map<Item, Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.AssociationRules
-
Computes the association rules with confidence above minConfidence.
- run(JavaRDD<FPGrowth.FreqItemset<Item>>) - Method in class org.apache.spark.mllib.fpm.AssociationRules
-
Java-friendly version of run.
- run(RDD<Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Computes an FP-Growth model that contains frequent itemsets.
- run(JavaRDD<Basket>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Java-friendly version of run.
- run(RDD<Object[]>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
-
Finds the complete set of frequent sequential patterns in the input sequences of itemsets.
- run(JavaRDD<Sequence>) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
-
A Java-friendly version of run() that reads sequences from a JavaRDD and returns frequent sequences in a PrefixSpanModel.
- run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Run ALS with the configured parameters on an input RDD of Rating objects.
- run(JavaRDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Java-friendly version of ALS.run.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Run the algorithm with the configured parameters on an input
RDD of LabeledPoint entries.
- run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Run the algorithm with the configured parameters on an input RDD
of LabeledPoint entries starting from the initial weights provided.
- run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
Run IsotonicRegression algorithm to obtain isotonic regression model.
- run(JavaRDD<Tuple3<Double, Double, Double>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
Run pool adjacent violators algorithm to obtain isotonic regression model.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model over an RDD
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Method to train a gradient boosting model
- run(JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.run.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model over an RDD
- run(SparkSession, SparkPlan) - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
-
- run(SparkSession, SparkPlan) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
-
- run(SparkSession, SparkPlan) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
Inserts all the rows in the table into Hive.
- run() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread
-
- run() - Method in class org.apache.spark.util.SparkShutdownHook
-
- runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Method in class org.apache.spark.SparkContext
-
Developer API
Run a job that can return approximate results.
- runId() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Returns the unique id of this run of the query.
- runId() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryStartedEvent
-
- runId() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryTerminatedEvent
-
- runId() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
- runInNewThread(String, boolean, Function0<T>) - Static method in class org.apache.spark.util.ThreadUtils
-
Run a piece of code in a new thread and return the result.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and pass the results to the given
handler function.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and return the results as an array.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and return the results as an array.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and return the results in an array.
- runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and return the results in an array.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and pass the results to a handler function.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and pass the results to a handler function.
- runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS
-
Run Limited-memory BFGS (L-BFGS) in parallel.
- runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector, double) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
-
Run stochastic gradient descent (SGD) in parallel using mini batches.
- runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
-
Alias of runMiniBatchSGD with convergenceTol set to default value of 0.001.
- running() - Method in class org.apache.spark.scheduler.TaskInfo
-
- RUNNING() - Static method in class org.apache.spark.TaskState
-
- runningTasks() - Method in interface org.apache.spark.scheduler.Schedulable
-
- runParallelPersonalizedPageRank(Graph<VD, ED>, int, double, long[], ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run Personalized PageRank for a fixed number of iterations, for a
set of starting nodes in parallel.
- runPreCanonicalized(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
-
- runSqlHive(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Runs a HiveQL command using Hive, returning the results as a list of strings.
- runtime() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
-
- RuntimeConfig - Class in org.apache.spark.sql
-
Runtime configuration interface for Spark.
- RuntimeInfo - Class in org.apache.spark.status.api.v1
-
- RuntimePercentage - Class in org.apache.spark.scheduler
-
- RuntimePercentage(double, Option<Object>, double) - Constructor for class org.apache.spark.scheduler.RuntimePercentage
-
- runUntilConvergence(Graph<VD, ED>, double, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
- runUntilConvergenceWithOptions(Graph<VD, ED>, double, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
- runWithOptions(Graph<VD, ED>, int, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run PageRank for a fixed number of iterations returning a graph
with vertex attributes containing the PageRank and edge
attributes containing the normalized edge weight.
- runWithValidation(RDD<LabeledPoint>, RDD<LabeledPoint>, BoostingStrategy, long, String) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees
-
Method to validate a gradient boosting model
- runWithValidation(RDD<LabeledPoint>, RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Method to validate a gradient boosting model
- runWithValidation(JavaRDD<LabeledPoint>, JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.runWithValidation.
- RUtils - Class in org.apache.spark.api.r
-
- RUtils() - Constructor for class org.apache.spark.api.r.RUtils
-
- RWrappers - Class in org.apache.spark.ml.r
-
This is the Scala stub of SparkR read.ml.
- RWrappers() - Constructor for class org.apache.spark.ml.r.RWrappers
-
- RWrapperUtils - Class in org.apache.spark.ml.r
-
- RWrapperUtils() - Constructor for class org.apache.spark.ml.r.RWrapperUtils
-
- s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- safeCall(Function0<T>) - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler
-
- sameThread() - Static method in class org.apache.spark.util.ThreadUtils
-
An ExecutionContextExecutor that runs each task in the thread that invokes execute/submit.
- sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a sampled subset of this RDD.
- sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a sampled subset of this RDD with a random seed.
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a sampled subset of this RDD, with a user-supplied seed.
- sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD
-
Return a sampled subset of this RDD.
- sample(double, long) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset by sampling a fraction of rows (without replacement),
using a user-supplied seed.
- sample(double) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset by sampling a fraction of rows (without replacement),
using a random seed.
- sample(boolean, double, long) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset by sampling a fraction of rows, using a user-supplied seed.
- sample(boolean, double) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset by sampling a fraction of rows, using a random seed.
- sample() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- sample() - Method in class org.apache.spark.util.random.BernoulliSampler
-
- sample() - Method in class org.apache.spark.util.random.PoissonSampler
-
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
-
- sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler
-
Take a random sample.
- sample() - Method in interface org.apache.spark.util.random.RandomSampler
-
Whether to sample the next item or not.
- sampleBy(String, Map<T, Object>, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Returns a stratified sample without replacement based on the fraction given on each stratum.
- sampleBy(String, Map<T, Double>, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Returns a stratified sample without replacement based on the fraction given on each stratum.
- sampleByKey(boolean, Map<K, Double>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKey(boolean, Map<K, Double>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKeyExact(boolean, Map<K, Double>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleByKeyExact(boolean, Map<K, Double>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
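
The per-stratum size quoted in the sampleByKeyExact entries, math.ceil(numItems * samplingRate), can be illustrated without Spark. The following is a minimal sketch in plain Java. The class ExactStratumSizeSketch and its method are hypothetical names introduced here, not part of the Spark API, and treating a key with no configured fraction as 0.0 is an assumption of this sketch rather than documented Spark behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark: computes the exact sample size
// ceil(numItems * samplingRate) for each stratum (key).
final class ExactStratumSizeSketch {
    private ExactStratumSizeSketch() {}

    static Map<String, Long> exactSampleSizes(Map<String, Long> itemsPerKey,
                                              Map<String, Double> fractions) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : itemsPerKey.entrySet()) {
            // Assumption of this sketch: a key with no fraction is sampled at rate 0.0.
            double rate = fractions.getOrDefault(entry.getKey(), 0.0);
            sizes.put(entry.getKey(), (long) Math.ceil(entry.getValue() * rate));
        }
        return sizes;
    }
}
```

For instance, a stratum of 10 items sampled at rate 0.25 yields ceil(2.5) = 3 items, and a stratum of 7 items at rate 0.5 yields ceil(3.5) = 4.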
- SamplePathFilter - Class in org.apache.spark.ml.image
-
Filter that allows loading a fraction of HDFS files.
- SamplePathFilter() - Constructor for class org.apache.spark.ml.image.SamplePathFilter
-
- samplePointsPerPartitionHint() - Method in class org.apache.spark.RangePartitioner
-
- sampleRatio() - Method in class org.apache.spark.ml.image.SamplePathFilter
-
- sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- sampleStdev() - Method in class org.apache.spark.util.StatCounter
-
Return the sample standard deviation of the values, which corrects for bias in estimating the
variance by dividing by N-1 instead of N.
- sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the variance by dividing by N-1 instead of N).
- sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the variance by dividing by N-1 instead of N).
- sampleVariance() - Method in class org.apache.spark.util.StatCounter
-
Return the sample variance, which corrects for bias in estimating the variance by dividing
by N-1 instead of N.
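
The N-1 correction described in the sampleStdev and sampleVariance entries can be shown in plain Java. This is a minimal sketch, not Spark's implementation: SampleStatsSketch and its methods are hypothetical names, the two-pass computation is chosen for readability, and returning NaN for fewer than two values is an assumption of this sketch.

```java
// Hypothetical helper, not part of Spark: contrasts dividing by N
// (population variance) with dividing by N-1 (sample variance).
final class SampleStatsSketch {
    private SampleStatsSketch() {}

    // Sum of squared deviations from the mean, computed in two passes.
    static double sumSquaredDeviations(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        double mean = sum / xs.length;
        double acc = 0.0;
        for (double x : xs) {
            double d = x - mean;
            acc += d * d;
        }
        return acc;
    }

    // Divides by N.
    static double populationVariance(double[] xs) {
        if (xs.length == 0) {
            return Double.NaN;
        }
        return sumSquaredDeviations(xs) / xs.length;
    }

    // Divides by N-1; assumption of this sketch: NaN when fewer than two values.
    static double sampleVariance(double[] xs) {
        if (xs.length < 2) {
            return Double.NaN;
        }
        return sumSquaredDeviations(xs) / (xs.length - 1);
    }

    // Square root of the sample variance.
    static double sampleStdev(double[] xs) {
        return Math.sqrt(sampleVariance(xs));
    }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the squared deviations sum to 32, so the population variance is 32/8 = 4 while the sample variance is 32/7.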
- SamplingUtils - Class in org.apache.spark.util.random
-
- SamplingUtils() - Constructor for class org.apache.spark.util.random.SamplingUtils
-
- satisfy(Distribution) - Method in interface org.apache.spark.sql.sources.v2.reader.partitioning.Partitioning
-
Returns true if this partitioning can satisfy the given distribution, which means Spark does
not need to shuffle the output data of this data source for certain operations.
- save(String) - Method in interface org.apache.spark.ml.util.MLWritable
-
Saves this ML instance to the input path, a shortcut of write.save(path).
- save(String) - Method in class org.apache.spark.ml.util.MLWriter
-
Saves the ML instances to the input path.
- save(SparkContext, String, String, int, int, Vector, double, Option<Object>) - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
Helper method for saving GLM classification model metadata and data.
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- save(SparkContext, String, org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0.Data) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
-
- save(SparkContext, String, org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0.Data) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.SVMModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
-
- save(SparkContext, BisectingKMeansModel, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV1_0$
-
- save(SparkContext, BisectingKMeansModel, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV2_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- save(SparkContext, KMeansModel, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV1_0$
-
- save(SparkContext, KMeansModel, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV2_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- save(SparkContext, PowerIterationClusteringModel, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel.SaveLoadV1_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
- save(SparkContext, ChiSqSelectorModel, String) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
-
Save this model to the given path.
- save(FPGrowthModel<?>, String) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel
-
Save this model to the given path.
- save(PrefixSpanModel<?>, String) - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Save this model to the given path.
- save(MatrixFactorizationModel, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
Saves a MatrixFactorizationModel, where user features are saved under data/users and
product features are saved under data/products.
- save(SparkContext, String, String, Vector, double) - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
Helper method for saving GLM regression model metadata and data.
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LassoModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- save(SparkContext, String, DecisionTreeModel) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- save(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Saveable
-
Save this model to the given path.
- save(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame at the specified path.
- save() - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame as the specified table.
- Saveable - Interface in org.apache.spark.mllib.util
-
Developer API
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
that storage system.
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
that storage system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
- saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this DStream as a Hadoop file.
- SaveAsHiveFile - Interface in org.apache.spark.sql.hive.execution
-
- saveAsHiveFile(SparkSession, SparkPlan, Configuration, org.apache.spark.sql.hive.HiveShim.ShimFileSinkDesc, String, Map<Map<String, String>, String>, Seq<Attribute>) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
-
- saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Save labeled data in LIBSVM format.
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported storage system, using
a Configuration object for that storage system.
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop
Configuration object for that storage system.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat
(mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat
(mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
- saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this DStream as a Hadoop file.
- saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a SequenceFile of serialized objects.
- saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a SequenceFile of serialized objects.
- saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
-
Save each RDD in this DStream as a Sequence file of serialized objects.
- saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions
-
Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key
and value types.
- saveAsTable(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame as the specified table.
- saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a text file, using string representations of elements.
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a compressed text file, using string representations of elements.
- saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a text file, using string representations of elements.
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a compressed text file, using string representations of elements.
- saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
-
Save each RDD in this DStream as a text file, using string representations
of elements.
- savedTasks() - Method in class org.apache.spark.status.LiveStage
-
- saveImpl(Params, PipelineStage[], SparkContext, String) - Method in class org.apache.spark.ml.Pipeline.SharedReadWrite$
-
Save metadata and stages for a Pipeline or PipelineModel:
metadata is saved to path/metadata and stages are saved to stages/IDX_UID.
- saveImpl(M, String, SparkSession, JsonAST.JObject) - Static method in class org.apache.spark.ml.tree.EnsembleModelReadWrite
-
Helper method for saving a tree ensemble to disk.
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
- SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- SaveLoadV2_0$() - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
-
- SaveLoadV2_0$() - Constructor for class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV2_0$
-
- SaveLoadV2_0$() - Constructor for class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV2_0$
-
- SaveMode - Enum in org.apache.spark.sql
-
SaveMode is used to specify the expected behavior of saving a DataFrame to a data source.
- sc() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- sc() - Method in interface org.apache.spark.ml.util.BaseReadWrite
-
Returns the underlying `SparkContext`.
- sc() - Method in class org.apache.spark.sql.SQLImplicits.StringToColumn
-
- scal(double, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS
-
x = a * x
- scal(double, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
x = a * x
- scalaBoolean() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for Scala's primitive boolean type.
- scalaByte() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for Scala's primitive byte type.
- scalaDouble() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for Scala's primitive double type.
- scalaFloat() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for Scala's primitive float type.
- scalaInt() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for Scala's primitive int type.
- scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- scalaLong() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for Scala's primitive long type.
- scalaShort() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for Scala's primitive short type.
- scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- scalaVersion() - Method in class org.apache.spark.status.api.v1.RuntimeInfo
-
- scale() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- scale() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- scale() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- scale() - Method in class org.apache.spark.sql.types.Decimal
-
- scale() - Method in class org.apache.spark.sql.types.DecimalType
-
- scalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-
the vector to multiply with input vectors
- scalingVec() - Method in class org.apache.spark.mllib.feature.ElementwiseProduct
-
- Schedulable - Interface in org.apache.spark.scheduler
-
An interface for schedulable entities.
- SchedulableBuilder - Interface in org.apache.spark.scheduler
-
An interface to build a Schedulable tree.
buildPools: builds the tree nodes (pools).
addTaskSetManager: builds the leaf nodes (TaskSetManagers).
- schedulableQueue() - Method in interface org.apache.spark.scheduler.Schedulable
-
- SCHEDULED() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
-
- SCHEDULER_DELAY() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.ToolTips
-
- SchedulerBackend - Interface in org.apache.spark.scheduler
-
A backend interface for scheduling systems that allows plugging in different ones under
TaskSchedulerImpl.
- SchedulerBackendUtils - Class in org.apache.spark.scheduler.cluster
-
- SchedulerBackendUtils() - Constructor for class org.apache.spark.scheduler.cluster.SchedulerBackendUtils
-
- schedulerDelay() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- schedulerDelay(TaskData) - Static method in class org.apache.spark.status.AppStatusUtils
-
- schedulerDelay(long, long, long, long, long, long) - Static method in class org.apache.spark.status.AppStatusUtils
-
- SchedulerPool - Class in org.apache.spark.status
-
- SchedulerPool(String) - Constructor for class org.apache.spark.status.SchedulerPool
-
- SchedulingAlgorithm - Interface in org.apache.spark.scheduler
-
An interface for sort algorithms.
FIFO: FIFO algorithm between TaskSetManagers.
FS: FS algorithm between Pools, and FIFO or FS within Pools.
- schedulingDelay() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
-
- schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for the first job of this batch to start processing from the time this batch
was submitted to the streaming scheduler.
- schedulingMode() - Method in interface org.apache.spark.scheduler.Schedulable
-
- SchedulingMode - Class in org.apache.spark.scheduler
-
"FAIR" and "FIFO" determine which policy is used
to order tasks amongst a Schedulable's sub-queues.
"NONE" is used when a Schedulable has no sub-queues.
- SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
-
- schedulingMode() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- schedulingPool() - Method in class org.apache.spark.status.api.v1.StageData
-
- schedulingPool() - Method in class org.apache.spark.status.LiveStage
-
- schema(StructType) - Method in class org.apache.spark.sql.DataFrameReader
-
Specifies the input schema.
- schema(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Specifies the schema by using the input DDL-formatted string.
- schema() - Method in class org.apache.spark.sql.Dataset
-
Returns the schema of this Dataset.
- schema() - Method in interface org.apache.spark.sql.Encoder
-
Returns the schema of encoding this type of object as a Row.
- schema() - Method in interface org.apache.spark.sql.Row
-
Schema for the row.
- schema() - Method in class org.apache.spark.sql.sources.BaseRelation
-
- schema(StructType) - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Specifies the input schema.
- schema(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Specifies the schema by using the input DDL-formatted string.
- schema_of_json(String) - Static method in class org.apache.spark.sql.functions
-
Parses a JSON string and infers its schema in DDL format.
- schema_of_json(Column) - Static method in class org.apache.spark.sql.functions
-
Parses a JSON string and infers its schema in DDL format.
- schemaLess() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- SchemaRelationProvider - Interface in org.apache.spark.sql.sources
-
Implemented by objects that produce relations for a specific kind of data source
with a given schema.
- SchemaUtils - Class in org.apache.spark.ml.util
-
Utils for handling schemas.
- SchemaUtils() - Constructor for class org.apache.spark.ml.util.SchemaUtils
-
- SchemaUtils - Class in org.apache.spark.sql.util
-
Utils for handling schemas.
- SchemaUtils() - Constructor for class org.apache.spark.sql.util.SchemaUtils
-
- scope() - Method in class org.apache.spark.storage.RDDInfo
-
- scoreAndLabels() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
- scratch() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
-
- Scripts() - Method in interface org.apache.spark.sql.hive.HiveStrategies
-
- Scripts() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.Scripts
-
- Scripts$() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.Scripts$
-
- ScriptTransformationExec - Class in org.apache.spark.sql.hive.execution
-
Transforms the input by forking and running the specified script.
- ScriptTransformationExec(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveScriptIOSchema) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformationExec
-
- ScriptTransformationWriterThread - Class in org.apache.spark.sql.hive.execution
-
- ScriptTransformationWriterThread(Iterator<InternalRow>, Seq<DataType>, org.apache.spark.sql.catalyst.expressions.Projection, AbstractSerDe, ObjectInspector, HiveScriptIOSchema, OutputStream, Process, org.apache.spark.util.CircularBuffer, TaskContext, Configuration) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread
-
- second(Column) - Static method in class org.apache.spark.sql.functions
-
Extracts the seconds as an integer from a given date/timestamp/string.
- seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- seconds(long) - Static method in class org.apache.spark.streaming.Durations
-
- Seconds - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing a given number of seconds.
- Seconds() - Constructor for class org.apache.spark.streaming.Seconds
-
- securityManager() - Method in class org.apache.spark.SparkEnv
-
- securityManager() - Method in interface org.apache.spark.status.api.v1.UIRoot
-
- seed() - Method in interface org.apache.spark.ml.param.shared.HasSeed
-
Param for random seed.
- seedParam() - Static method in class org.apache.spark.ml.image.SamplePathFilter
-
- select(Column...) - Method in class org.apache.spark.sql.Dataset
-
Selects a set of column based expressions.
- select(String, String...) - Method in class org.apache.spark.sql.Dataset
-
Selects a set of columns.
- select(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Selects a set of column based expressions.
- select(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Selects a set of columns.
- select(TypedColumn<T, U1>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
Returns a new Dataset by computing the given Column expression for each element.
- select(TypedColumn<T, U1>, TypedColumn<T, U2>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
Returns a new Dataset by computing the given Column expressions for each element.
- select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
Returns a new Dataset by computing the given Column expressions for each element.
- select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>, TypedColumn<T, U4>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
Returns a new Dataset by computing the given Column expressions for each element.
- select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>, TypedColumn<T, U4>, TypedColumn<T, U5>) - Method in class org.apache.spark.sql.Dataset
-
Experimental
Returns a new Dataset by computing the given Column expressions for each element.
- selectedFeatures() - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-
List of indices to select (filter).
- selectedFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
- selectExpr(String...) - Method in class org.apache.spark.sql.Dataset
-
Selects a set of SQL expressions.
- selectExpr(Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Selects a set of SQL expressions.
- selectorType() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
-
The selector type of the ChiSqSelector.
- selectorType() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- self() - Method in interface org.apache.spark.rpc.RpcEndpoint
-
- sendData(String, Seq<Object>) - Method in interface org.apache.spark.streaming.kinesis.KinesisDataGenerator
-
Sends the data to Kinesis and returns the metadata for everything that has been sent.
- sender() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
-
- senderAddress() - Method in interface org.apache.spark.rpc.RpcCallContext
-
The sender of this message.
- sendFailure(Throwable) - Method in interface org.apache.spark.rpc.RpcCallContext
-
Report a failure to the sender.
- sendToDst(A) - Method in class org.apache.spark.graphx.EdgeContext
-
Sends a message to the destination vertex.
- sendToDst(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- sendToSrc(A) - Method in class org.apache.spark.graphx.EdgeContext
-
Sends a message to the source vertex.
- sendToSrc(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- sendWith(TransportClient) - Method in interface org.apache.spark.rpc.netty.OutboxMessage
-
- seqToString(Seq<T>, Function1<T, String>) - Static method in class org.apache.spark.internal.config.ConfigHelpers
-
- sequence() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
-
- sequence(Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Generate a sequence of integers from start to stop, incrementing by step.
- sequence(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Generate a sequence of integers from start to stop,
incrementing by 1 if start is less than or equal to stop, otherwise -1.
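The default step rule of the two-argument form can be sketched in plain Java. This is only an illustration of the semantics described above, not Spark's implementation: the class and method names are hypothetical, and treating stop as inclusive is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: mirrors the documented step rule.
final class SequenceSketch {

    // Two-argument form: step is 1 if start <= stop, otherwise -1.
    static List<Long> sequence(long start, long stop) {
        return sequence(start, stop, start <= stop ? 1L : -1L);
    }

    // Three-argument form: from start to stop (assumed inclusive), incrementing by step.
    // Overflow near Long.MAX_VALUE / Long.MIN_VALUE is not handled in this sketch.
    static List<Long> sequence(long start, long stop, long step) {
        if (step == 0) {
            throw new IllegalArgumentException("step must be non-zero");
        }
        List<Long> out = new ArrayList<>();
        if (step > 0) {
            for (long v = start; v <= stop; v += step) {
                out.add(v);
            }
        } else {
            for (long v = start; v >= stop; v += step) {
                out.add(v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sequence(1, 5));     // [1, 2, 3, 4, 5]
        System.out.println(sequence(5, 1));     // [5, 4, 3, 2, 1]
        System.out.println(sequence(0, 10, 5)); // [0, 5, 10]
    }
}
```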
- sequenceCol() - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
Param for the name of the sequence column in dataset (default "sequence"), rows with
nulls in this column are ignored.
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get an RDD for a Hadoop SequenceFile.
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext
-
Version of sequenceFile() for types implicitly convertible to Writables through a
WritableConverter.
- SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile,
through an implicit conversion.
- SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Class<? extends Writable>, Class<? extends Writable>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
-
- SER_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SerDe - Class in org.apache.spark.api.r
-
Utility functions to serialize and deserialize objects to and from R.
- SerDe() - Constructor for class org.apache.spark.api.r.SerDe
-
- SERDE() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- serde() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- serdeProperties() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
-
- SerializableMapWrapper(Map<A, B>) - Constructor for class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
-
- SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
-
- SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
-
- SerializationDebugger - Class in org.apache.spark.serializer
-
- SerializationDebugger() - Constructor for class org.apache.spark.serializer.SerializationDebugger
-
- SerializationDebugger.ObjectStreamClassMethods - Class in org.apache.spark.serializer
-
An implicit class that allows us to call private methods of ObjectStreamClass.
- SerializationDebugger.ObjectStreamClassMethods$ - Class in org.apache.spark.serializer
-
- SerializationFormats - Class in org.apache.spark.api.r
-
- SerializationFormats() - Constructor for class org.apache.spark.api.r.SerializationFormats
-
- SerializationStream - Class in org.apache.spark.serializer
-
Developer API
A stream for writing serialized objects.
- SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
-
- serializationStream() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
-
- serialize(Vector) - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
-
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- serialize(T) - Static method in class org.apache.spark.util.Utils
-
Serialize an object using Java serialization
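For readers unfamiliar with Java serialization, a minimal round trip using only the JDK might look like the following. The class and method names are hypothetical and this is not the body of the Spark utility; it only shows the standard ObjectOutputStream / ObjectInputStream mechanism that the description refers to.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Spark.
final class JavaSerializationSketch {

    // Writes a Serializable object into a byte array.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the object back from the byte array.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize", e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize("spark");
        System.out.println(deserialize(data)); // spark
    }
}
```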
- SERIALIZED_R_DATA_SCHEMA() - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- serializedData() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- serializedMapStatus(org.apache.spark.broadcast.BroadcastManager, boolean, int) - Method in class org.apache.spark.ShuffleStatus
-
Serializes the mapStatuses array into an efficient compressed format.
- SerializedMemoryEntry<T> - Class in org.apache.spark.storage.memory
-
- SerializedMemoryEntry(org.apache.spark.util.io.ChunkedByteBuffer, MemoryMode, ClassTag<T>) - Constructor for class org.apache.spark.storage.memory.SerializedMemoryEntry
-
- SerializedValuesHolder<T> - Class in org.apache.spark.storage.memory
-
A holder for storing the serialized values.
- SerializedValuesHolder(BlockId, int, ClassTag<T>, MemoryMode, org.apache.spark.serializer.SerializerManager) - Constructor for class org.apache.spark.storage.memory.SerializedValuesHolder
-
- Serializer - Class in org.apache.spark.serializer
-
Developer API
A serializer.
- Serializer() - Constructor for class org.apache.spark.serializer.Serializer
-
- serializer() - Method in class org.apache.spark.ShuffleDependency
-
- serializer() - Method in class org.apache.spark.SparkEnv
-
- SerializerInstance - Class in org.apache.spark.serializer
-
Developer API
An instance of a serializer, for use by one thread at a time.
- SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
-
- serializerManager() - Method in class org.apache.spark.SparkEnv
-
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.DummySerializerInstance
-
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- serializeViaNestedStream(OutputStream, SerializerInstance, Function1<SerializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Serialize via nested stream using specific serializer
- servletContext() - Method in interface org.apache.spark.status.api.v1.ApiRequestContext
-
- ServletParams(Function1<HttpServletRequest, T>, String, Function1<T, String>) - Constructor for class org.apache.spark.ui.JettyUtils.ServletParams
-
- ServletParams$() - Constructor for class org.apache.spark.ui.JettyUtils.ServletParams$
-
- session(SparkSession) - Static method in class org.apache.spark.ml.r.RWrappers
-
- session(SparkSession) - Method in interface org.apache.spark.ml.util.BaseReadWrite
-
Sets the Spark Session to use for saving/loading.
- session(SparkSession) - Method in class org.apache.spark.ml.util.GeneralMLWriter
-
- session(SparkSession) - Method in class org.apache.spark.ml.util.MLReader
-
- session(SparkSession) - Method in class org.apache.spark.ml.util.MLWriter
-
- sessionCatalog() - Method in class org.apache.spark.sql.hive.RelationConversions
-
- SessionConfigSupport - Interface in org.apache.spark.sql.sources.v2
-
- sessionState() - Method in class org.apache.spark.sql.SparkSession
-
State isolated across sessions, including SQL configurations, temporary tables, registered
functions, and everything else that accepts a SQLConf.
- set(long, long, int, int, VD, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- Set() - Static method in class org.apache.spark.metrics.sink.StatsdMetricType
-
- set(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
-
Sets a parameter in the embedded param map.
- set(String, Object) - Method in interface org.apache.spark.ml.param.Params
-
Sets a parameter (by name) in the embedded param map.
- set(ParamPair<?>) - Method in interface org.apache.spark.ml.param.Params
-
Sets a parameter in the embedded param map.
- set(String, long, long) - Static method in class org.apache.spark.rdd.InputFileBlockHolder
-
Sets the thread-local input block.
- set(String, String) - Method in class org.apache.spark.SparkConf
-
Set a configuration variable.
- set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
-
- set(String, String) - Method in class org.apache.spark.sql.RuntimeConfig
-
Sets the given Spark runtime configuration property.
- set(String, boolean) - Method in class org.apache.spark.sql.RuntimeConfig
-
Sets the given Spark runtime configuration property.
- set(String, long) - Method in class org.apache.spark.sql.RuntimeConfig
-
Sets the given Spark runtime configuration property.
- set(long) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given Long.
- set(int) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given Int.
- set(long, int, int) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given unscaled Long, with a given precision and scale.
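How an unscaled long combines with a scale can be illustrated with java.math.BigDecimal. This is a sketch under assumptions: the class name and the precision check are hypothetical, and how Spark's Decimal itself reports a value that does not fit the precision is not covered here.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of Spark.
final class UnscaledDecimalSketch {

    // Interprets an unscaled long with a scale: value = unscaled * 10^(-scale).
    static BigDecimal fromUnscaled(long unscaled, int precision, int scale) {
        BigDecimal value = BigDecimal.valueOf(unscaled, scale);
        // Assumed check: the number of digits must fit in the requested precision.
        if (value.precision() > precision) {
            throw new ArithmeticException(
                "unscaled value " + unscaled + " needs more than " + precision + " digits");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fromUnscaled(12345L, 5, 2)); // 123.45
    }
}
```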
- set(BigDecimal, int, int) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given BigDecimal value, with a given precision and scale.
- set(BigDecimal) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given BigDecimal value, inheriting its precision and scale.
- set(BigInteger) - Method in class org.apache.spark.sql.types.Decimal
-
If the value is not in the range of long, it is converted to BigDecimal, and
the precision and scale are based on the converted value.
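The range test and the fallback conversion can be sketched with the JDK alone. The names below are hypothetical and the code is not Spark's; it only shows one way to decide whether a BigInteger fits in a long and which precision and scale a BigDecimal conversion yields.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of Spark.
final class BigIntegerRangeSketch {

    // True when the value can be represented as a signed 64-bit long.
    static boolean fitsInLong(BigInteger value) {
        return value.bitLength() <= 63;
    }

    // Returns {precision, scale} of the value after conversion to BigDecimal.
    static int[] precisionAndScale(BigInteger value) {
        BigDecimal converted = new BigDecimal(value);
        return new int[] {converted.precision(), converted.scale()};
    }

    public static void main(String[] args) {
        BigInteger big = BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);
        System.out.println(fitsInLong(BigInteger.TEN)); // true
        System.out.println(fitsInLong(big));            // false
        int[] ps = precisionAndScale(big);
        System.out.println(ps[0] + ", " + ps[1]);
    }
}
```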
- set(Decimal) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given Decimal value.
- setActive(SQLContext) - Static method in class org.apache.spark.sql.SQLContext
-
- setActiveSession(SparkSession) - Static method in class org.apache.spark.sql.SparkSession
-
Changes the SparkSession that will be returned in this thread and its children when
SparkSession.getOrCreate() is called.
- setAggregationDepth(int) - Method in class org.apache.spark.ml.classification.LinearSVC
-
Suggested depth for treeAggregate (greater than or equal to 2).
- setAggregationDepth(int) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Suggested depth for treeAggregate (greater than or equal to 2).
- setAggregationDepth(int) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
Suggested depth for treeAggregate (greater than or equal to 2).
- setAggregationDepth(int) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Suggested depth for treeAggregate (greater than or equal to 2).
- setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set aggregator for RDD's shuffle.
- setAlgo(String) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Sets Algorithm using a String.
- setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
-
Set multiple parameters together
- setAlpha(double) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setAlpha(Vector) - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for setDocConcentration()
- setAlpha(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for setDocConcentration()
- setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Sets the constant used in computing confidence in implicit ALS.
- setAppName(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Set the application name.
- setAppName(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- setAppName(String) - Method in class org.apache.spark.SparkConf
-
Set a name for your application.
- setAppResource(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Set the main application resource.
- setAppResource(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- setBandwidth(double) - Method in class org.apache.spark.mllib.stat.KernelDensity
-
Sets the bandwidth (standard deviation) of the Gaussian kernel (default: 1.0).
- setBeta(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for setTopicConcentration()
- setBinary(boolean) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- setBinary(boolean) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- setBinary(boolean) - Method in class org.apache.spark.ml.feature.HashingTF
-
- setBinary(boolean) - Method in class org.apache.spark.mllib.feature.HashingTF
-
If true, term frequency vector will be binary such that non-zero term counts will be set to 1
(default: false)
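The effect of the binary flag can be shown with a small term-count sketch. This is an illustration only: the class is hypothetical, and it keys counts by the term itself rather than by a hashed feature index as HashingTF does.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark.
final class BinaryTermFrequencySketch {

    // Counts terms; when binary is true, every non-zero count is set to 1.
    static Map<String, Integer> termFrequencies(List<String> terms, boolean binary) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : terms) {
            if (binary) {
                counts.put(term, 1);
            } else {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> doc = List.of("a", "b", "a");
        System.out.println(termFrequencies(doc, false)); // {a=2, b=1}
        System.out.println(termFrequencies(doc, true));  // {a=1, b=1}
    }
}
```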
- setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of blocks for both user blocks and product blocks to parallelize the computation
into; pass -1 for an auto-configured number of blocks.
- setBlockSize(int) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
Sets the value of param blockSize.
- setBucketLength(double) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setCacheNodeIds(boolean) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.setCallSite.
- setCallSite(String) - Method in class org.apache.spark.SparkContext
-
Set the thread-local property for overriding the call sites
of actions and RDDs.
- setCaseSensitive(boolean) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- setCategoricalCols(String[]) - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- setCategoricalFeaturesInfo(Map<Integer, Integer>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Sets categoricalFeaturesInfo using a Java Map.
- setCensorCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set the directory under which RDDs are going to be checkpointed.
- setCheckpointDir(String) - Method in class org.apache.spark.SparkContext
-
Set the directory under which RDDs are going to be checkpointed.
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
Specifies how often to checkpoint the cached node IDs.
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
Specifies how often to checkpoint the cached node IDs.
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
Specifies how often to checkpoint the cached node IDs.
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.clustering.LDA
-
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
Specifies how often to checkpoint the cached node IDs.
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
Specifies how often to checkpoint the cached node IDs.
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
Specifies how often to checkpoint the cached node IDs.
- setCheckpointInterval(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Sets the checkpoint interval (greater than or equal to 1), or disables checkpointing (-1).
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Developer API
Set period (in iterations) between checkpoints (default = 10).
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setClassifier(Classifier<?, ?, ?>) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- setColdStartStrategy(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setColdStartStrategy(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- setCollectSubModels(boolean) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
Whether to collect submodels when fitting.
- setCollectSubModels(boolean) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
Whether to collect submodels when fitting.
- setConf(Configuration) - Method in interface org.apache.spark.input.Configurable
-
- setConf(String, String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Set a single configuration value for the application.
- setConf(String, String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- setConf(Configuration) - Method in class org.apache.spark.ml.image.SamplePathFilter
-
- setConf(Properties) - Method in class org.apache.spark.sql.SQLContext
-
Set Spark SQL configuration properties.
- setConf(String, String) - Method in class org.apache.spark.sql.SQLContext
-
Set the given Spark SQL configuration property.
- setConfig(String, String) - Static method in class org.apache.spark.launcher.SparkLauncher
-
Set a configuration value for the launcher library.
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the largest change in log-likelihood at which convergence is
considered to have occurred.
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the convergence tolerance.
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the convergence tolerance of iterations for L-BFGS.
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the convergence tolerance.
- setCurrentDatabase(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Sets the current default database in this session.
- setCurrentDatabase(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Sets the name of current database.
- setCustomHostname(String) - Static method in class org.apache.spark.util.Utils
-
Allow setting a custom host name because when we run on Mesos we need to use the same
hostname it reports to the master.
- setDAGScheduler(DAGScheduler) - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- setDecayFactor(double) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the forgetfulness of the previous centroids.
- setDefault(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
-
Sets a default value for a param.
- setDefault(Seq<ParamPair<?>>) - Method in interface org.apache.spark.ml.param.Params
-
Sets default values for a list of params.
- setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer
-
Sets a class loader for the serializer to use in deserialization.
- setDefaultSession(SparkSession) - Static method in class org.apache.spark.sql.SparkSession
-
Sets the default SparkSession that is returned by the builder.
- setDegree(int) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-
- setDeployMode(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Set the deploy mode for the application.
- setDeployMode(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- setDistanceMeasure(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- setDistanceMeasure(String) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setDistanceMeasure(String) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- setDistanceMeasure(String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Set the distance suite used by the algorithm.
- setDistanceMeasure(String) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the distance suite used by the algorithm.
- setDocConcentration(double[]) - Method in class org.apache.spark.ml.clustering.LDA
-
- setDocConcentration(double) - Method in class org.apache.spark.ml.clustering.LDA
-
- setDocConcentration(Vector) - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- setDocConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Replicates a Double docConcentration to create a symmetric prior.
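A symmetric prior simply repeats one concentration value for every topic. The following sketch shows the idea with a plain array; the class name and the explicit topic count k are assumptions of the example, not part of the LDA API.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark.
final class SymmetricPriorSketch {

    // Replicates a single concentration value k times (one entry per topic).
    static double[] symmetricPrior(double concentration, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        double[] prior = new double[k];
        Arrays.fill(prior, concentration);
        return prior;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(symmetricPrior(0.5, 3))); // [0.5, 0.5, 0.5]
    }
}
```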
- setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- setDstCol(String) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- setElasticNetParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the ElasticNet mixing parameter.
- setElasticNetParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the ElasticNet mixing parameter.
- setEpsilon(double) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Sets the value of param epsilon.
- setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the distance threshold within which we consider centers to have converged.
- setError(PrintStream) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
- setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf
-
Set an environment variable to be used when launching executors for this application.
- setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
-
Set multiple environment variables to be used when launching executors.
- setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf
-
Set multiple environment variables to be used when launching executors.
- setFamily(String) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Sets the value of param family.
- setFamily(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the value of param family.
- setFdr(double) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setFdr(double) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- setFeatureIndex(int) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- setFeatureIndex(int) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.KMeansModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.LDA
-
The features for LDA should be a Vector representing the word counts in a document.
- setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.LDAModel
-
The features for LDA should be a Vector representing the word counts in a document.
- setFeaturesCol(String) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.RFormula
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.PredictionModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.Predictor
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
- setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setFeatureSubsetStrategy(String) - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
-
- setFinalRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Developer API
Sets storage level for final RDDs (user/product used in MatrixFactorizationModel).
- setFinalStorageLevel(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.classification.LinearSVC
-
Whether to fit an intercept term.
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Whether to fit an intercept term.
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
Set if we should fit the intercept. Default is true.
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets if we should fit the intercept.
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set if we should fit the intercept.
- setForceIndexLabel(boolean) - Method in class org.apache.spark.ml.feature.RFormula
-
- setFormula(String) - Method in class org.apache.spark.ml.feature.RFormula
-
Sets the formula to use for this transformer.
- setFpr(double) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setFpr(double) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- setFwe(double) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setFwe(double) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- setGaps(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the gradient function (of the loss function of one single data example)
to be used for SGD.
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the gradient function (of the loss function of one single data example)
to be used for L-BFGS.
- setHalfLife(double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the half life and time unit ("batches" or "points").
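A minimal sketch of what a half life means for the weight given to old data, assuming the usual definition that a contribution halves after `halfLife` time units (batches or points). `HalfLifeDecay` and its methods are hypothetical names for illustration and are not part of Spark.

```java
// Sketch only: HalfLifeDecay is a hypothetical helper, not a Spark class.
final class HalfLifeDecay {
    // Per-unit decay factor a, chosen so that a^halfLife == 0.5.
    static double decayFactor(double halfLife) {
        if (halfLife <= 0.0) {
            throw new IllegalArgumentException("halfLife must be positive");
        }
        return Math.exp(Math.log(0.5) / halfLife);
    }

    // Fraction of the original weight left after `elapsed` time units.
    static double remainingWeight(double halfLife, double elapsed) {
        return Math.pow(decayFactor(halfLife), elapsed);
    }

    public static void main(String[] args) {
        // One half life elapsed, then two half lives elapsed.
        System.out.println(remainingWeight(5.0, 5.0));
        System.out.println(remainingWeight(5.0, 10.0));
    }
}
```

With the time unit set to "batches" the elapsed value counts batches; with "points" it counts data points.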
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.RFormula
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.VectorSizeHint
-
- setHashAlgorithm(String) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Set the hash algorithm used when mapping term to integer.
- setIfMissing(String, String) - Method in class org.apache.spark.SparkConf
-
Set a parameter if it isn't already configured.
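A minimal sketch of the set-if-missing semantics described above, modeled with a plain map. `MiniConf` is a hypothetical stand-in used only for illustration; it is not SparkConf and makes no claim about SparkConf's internals.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a configuration object; not Spark's SparkConf.
final class MiniConf {
    private final Map<String, String> settings = new ConcurrentHashMap<>();

    // Unconditionally store the value, overwriting any earlier one.
    MiniConf set(String key, String value) {
        settings.put(key, value);
        return this;
    }

    // Store the value only when the key has no value yet.
    MiniConf setIfMissing(String key, String value) {
        settings.putIfAbsent(key, value);
        return this;
    }

    String get(String key) {
        return settings.get(key);
    }

    public static void main(String[] args) {
        MiniConf conf = new MiniConf()
            .set("spark.master", "local[4]")
            .setIfMissing("spark.master", "local")    // ignored: key already set
            .setIfMissing("spark.app.name", "demo");  // applied: key was missing
        System.out.println(conf.get("spark.master"));
        System.out.println(conf.get("spark.app.name"));
    }
}
```

This pattern is useful for supplying defaults that an explicit earlier setting should always override.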
- setImplicitPrefs(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Sets whether to use implicit preference.
- setImpurity(String) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setImpurity(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
The impurity setting is ignored for GBT models.
- setImpurity(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setImpurity(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setImpurity(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
The impurity setting is ignored for GBT models.
- setImpurity(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setImpurity(String) - Method in interface org.apache.spark.ml.tree.TreeClassifierParams
-
- setImpurity(String) - Method in interface org.apache.spark.ml.tree.TreeRegressorParams
-
- setImpurity(Impurity) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setIndices(int[]) - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- setInfo(PrintStream) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
- setInitialCenters(Vector[], double[]) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Specify initial centers directly.
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the initialization algorithm.
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Set the initialization mode.
- setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the number of steps for the k-means|| initialization mode.
- setInitialModel(GaussianMixtureModel) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the initial GMM starting point, bypassing the random initialization.
- setInitialModel(KMeansModel) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the initial starting point, bypassing the random initialization or k-means||.
The condition model.k == this.k must be met; failure results
in an IllegalArgumentException.
- setInitialWeights(Vector) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
Sets the value of param initialWeights.
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the initial weights.
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the initial weights.
- setInitMode(String) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setInitMode(String) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- setInitSteps(int) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.IDF
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.IndexToString
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.MinHashLSH
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.MinHashLSHModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- setInputCol(String) - Method in class org.apache.spark.ml.feature.PCA
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.PCAModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorSizeHint
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setInputCols(Seq<String>) - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.Imputer
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.ImputerModel
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.Interaction
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set if the algorithm should add an intercept.
- setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Developer API
Sets storage level for intermediate RDDs (user/product in/out links).
- setIntermediateStorageLevel(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setInverse(boolean) - Method in class org.apache.spark.ml.feature.DCT
-
- setIsotonic(boolean) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- setIsotonic(boolean) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
Sets the isotonic parameter.
- setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- setItemsCol(String) - Method in class org.apache.spark.ml.fpm.FPGrowth
-
- setItemsCol(String) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
-
- setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of iterations to run.
- setJars(Seq<String>) - Method in class org.apache.spark.SparkConf
-
Set JAR files to distribute to the cluster.
- setJars(String[]) - Method in class org.apache.spark.SparkConf
-
Set JAR files to distribute to the cluster.
- setJavaHome(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set a custom JAVA_HOME for launching the Spark application.
- setJobDescription(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set a human readable description of the current job.
- setJobDescription(String) - Method in class org.apache.spark.SparkContext
-
Set a human readable description of the current job.
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setK(int) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- setK(int) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- setK(int) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setK(int) - Method in class org.apache.spark.ml.clustering.LDA
-
- setK(int) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- setK(int) - Method in class org.apache.spark.ml.feature.PCA
-
- setK(int) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Sets the desired number of leaf clusters (default: 4).
- setK(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the number of Gaussians in the mixture model.
- setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the number of clusters to create (k).
- setK(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Set the number of topics to infer, i.e., the number of soft cluster centers.
- setK(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Set the number of clusters.
- setK(int) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the number of clusters.
- setKappa(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Learning rate: exponential decay rate. Should be in
(0.5, 1.0] to guarantee asymptotic convergence.
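A minimal sketch of how a decay rate like kappa typically enters an online LDA update, assuming the step-size schedule rho_t = (tau0 + t)^(-kappa) from the online variational Bayes literature. The class name, method names, and the tau0 value below are illustrative assumptions, not the optimizer's actual code, and the range check reflects only the recommendation above.

```java
// Sketch only: OnlineLdaStepSize is a hypothetical helper, not a Spark class.
final class OnlineLdaStepSize {
    // Step size for iteration t under the assumed schedule (tau0 + t)^(-kappa).
    // tau0 is a positive learning offset that downweights early iterations.
    static double stepSize(double tau0, double kappa, int iteration) {
        return Math.pow(tau0 + iteration, -kappa);
    }

    // True when kappa lies in the recommended interval (0.5, 1.0].
    static boolean inRecommendedRange(double kappa) {
        return kappa > 0.5 && kappa <= 1.0;
    }

    public static void main(String[] args) {
        double tau0 = 1024.0;  // illustrative offset, an assumption
        double kappa = 0.51;   // illustrative decay rate inside (0.5, 1.0]
        for (int t = 0; t <= 1000; t += 250) {
            System.out.println(t + " -> " + stepSize(tau0, kappa, t));
        }
    }
}
```

A larger kappa makes the step size shrink faster, so later mini-batches change the model less.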
- setKeepLastCheckpoint(boolean) - Method in class org.apache.spark.ml.clustering.LDA
-
- setKeepLastCheckpoint(boolean) - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-
If using checkpointing, this indicates whether to keep the last checkpoint (vs clean up).
- setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set key ordering for RDD's shuffle.
- setLabelCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- setLabelCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setLabelCol(String) - Method in class org.apache.spark.ml.feature.RFormula
-
- setLabelCol(String) - Method in class org.apache.spark.ml.Predictor
-
- setLabelCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- setLabelCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- setLabels(String[]) - Method in class org.apache.spark.ml.feature.IndexToString
-
- setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Set the smoothing parameter.
- setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the regularization parameter, lambda.
- setLayers(int[]) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
Sets the value of param layers.
- setLearningDecay(double) - Method in class org.apache.spark.ml.clustering.LDA
-
- setLearningOffset(double) - Method in class org.apache.spark.ml.clustering.LDA
-
- setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets initial learning rate (default: 0.025).
- setLearningRate(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setLink(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the value of param link.
- setLinkPower(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the value of param linkPower.
- setLinkPredictionCol(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the link prediction (linear predictor) column name.
- setLinkPredictionCol(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
Sets the link prediction (linear predictor) column name.
- setLocale(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set a local property that affects jobs submitted from this thread, and all child
threads, such as the Spark fair scheduler pool.
- setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext
-
Set a local property that affects jobs submitted from this thread, such as the Spark fair
scheduler pool.
- setLogLevel(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Control our logLevel.
- setLogLevel(String) - Method in class org.apache.spark.SparkContext
-
Control our logLevel.
- setLogLevel(Level) - Static method in class org.apache.spark.util.Utils
-
Configure a new log4j level.
- setLoss(String) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Sets the value of param loss.
- setLoss(Loss) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setLossType(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setLossType(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setLowerBoundsOnCoefficients(Matrix) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the lower bounds on coefficients if fitting under bound constrained optimization.
- setLowerBoundsOnIntercepts(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the lower bounds on intercepts if fitting under bound constrained optimization.
- setMainClass(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Sets the application class name for Java/Scala applications.
- setMainClass(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set mapSideCombine flag for RDD's shuffle.
- setMaster(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Set the Spark master for the application.
- setMaster(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- setMaster(String) - Method in class org.apache.spark.SparkConf
-
The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to
run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
- setMax(double) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-
- setMax(double) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMaxBins(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- setMaxBins(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxCategories(int) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMaxDepth(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- setMaxDepth(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxDF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.LinearSVC
-
Set the maximum number of iterations.
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the maximum number of iterations.
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
Set the maximum number of iterations.
- setMaxIter(int) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- setMaxIter(int) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- setMaxIter(int) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setMaxIter(int) - Method in class org.apache.spark.ml.clustering.LDA
-
- setMaxIter(int) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- setMaxIter(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setMaxIter(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
Set the maximum number of iterations.
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the maximum number of iterations (applicable for solver "irls").
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the maximum number of iterations.
- setMaxIter(int) - Method in interface org.apache.spark.ml.tree.GBTParams
-
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Sets the max number of k-means iterations to split clusters (default: 20).
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the maximum number of iterations allowed.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set maximum number of iterations allowed.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Set the maximum number of iterations allowed.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Set the maximum number of iterations of the power iteration loop.
- setMaxLocalProjDBSize(long) - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- setMaxLocalProjDBSize(long) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
-
Sets the maximum number of items (including delimiters used in the internal storage format)
allowed in a projected database before local processing (default: 32000000L).
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMaxMemoryInMB(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxPatternLength(int) - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- setMaxPatternLength(int) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
-
Sets maximal pattern length (default: 10).
- setMaxSentenceLength(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setMaxSentenceLength(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets the maximum length (in words) of each sentence in the input data.
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- setMin(double) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-
- setMin(double) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- setMinConfidence(double) - Method in class org.apache.spark.ml.fpm.FPGrowth
-
- setMinConfidence(double) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
-
- setMinConfidence(double) - Method in class org.apache.spark.mllib.fpm.AssociationRules
-
Sets the minimal confidence (default: 0.8).
- setMinCount(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setMinCount(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets minCount, the minimum number of times a token must appear to be included in the word2vec
model's vocabulary (default: 5).
- setMinDF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- setMinDivisibleClusterSize(double) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- setMinDivisibleClusterSize(double) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Sets the minimum number of points (if greater than or equal to 1.0) or the minimum proportion
of points (if less than 1.0) of a divisible cluster (default: 1).
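A minimal sketch of the threshold rule described above: values of 1.0 or more are read as a point count, values below 1.0 as a proportion of all points. `DivisibleClusterRule` is a hypothetical helper for illustration, and rounding up with `Math.ceil` is an assumption rather than something the description states.

```java
// Sketch only: DivisibleClusterRule is a hypothetical helper, not a Spark class.
final class DivisibleClusterRule {
    // Smallest cluster size that still counts as divisible.
    static long minimumSize(double minDivisibleClusterSize, long totalPoints) {
        if (minDivisibleClusterSize >= 1.0) {
            // Interpreted as an absolute number of points.
            return (long) Math.ceil(minDivisibleClusterSize);
        }
        // Interpreted as a proportion of all points (rounding up is assumed).
        return (long) Math.ceil(minDivisibleClusterSize * totalPoints);
    }

    static boolean isDivisible(long clusterSize, double minDivisibleClusterSize, long totalPoints) {
        return clusterSize >= minimumSize(minDivisibleClusterSize, totalPoints);
    }

    public static void main(String[] args) {
        long total = 1000L;
        System.out.println(minimumSize(10.0, total));       // absolute count
        System.out.println(minimumSize(0.25, total));       // proportion of 1000 points
        System.out.println(isDivisible(249L, 0.25, total)); // below the proportional threshold
    }
}
```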
- setMinDocFreq(int) - Method in class org.apache.spark.ml.feature.IDF
-
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the fraction of each batch to use for updates.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Mini-batch fraction in (0, 1], which sets the fraction of documents sampled and used in
each iteration.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set fraction of data to be used for each SGD iteration.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the fraction of each batch to use for updates.
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMinInfoGain(double) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- setMinInfoGain(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMinInstancesPerNode(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMinSupport(double) - Method in class org.apache.spark.ml.fpm.FPGrowth
-
- setMinSupport(double) - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Sets the minimal support level (default: 0.3).
- setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.PrefixSpan
-
Sets the minimal support level (default: 0.1).
- setMinTF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- setMinTF(double) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- setMinTokenLength(int) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- setMissingValue(double) - Method in class org.apache.spark.ml.feature.Imputer
-
- setModelType(String) - Method in class org.apache.spark.ml.classification.NaiveBayes
-
Set the model type using a string (case-sensitive).
- setModelType(String) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Set the model type using a string (case-sensitive).
- setN(int) - Method in class org.apache.spark.ml.feature.NGram
-
- setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Assign a name to this RDD.
- setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Assign a name to this RDD.
- setName(String) - Method in class org.apache.spark.api.java.JavaRDD
-
Assign a name to this RDD.
- setName(String) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- setName(String) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- setName(String) - Method in class org.apache.spark.rdd.RDD
-
Assign a name to this RDD.
- setNames(String[]) - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- setNonnegative(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set whether the least-squares problems solved at each iteration should have
nonnegativity constraints.
- setNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- setNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
-
- setNumBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
Sets both numUserBlocks and numItemBlocks to the specific value.
- setNumBuckets(int) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- setNumBucketsArray(int[]) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- setNumClasses(int) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-
Set the number of possible outcomes for k classes classification problem in
Multinomial Logistic Regression.
- setNumClasses(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the number of corrections used in the LBFGS update.
- setNumFeatures(int) - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- setNumFeatures(int) - Method in class org.apache.spark.ml.feature.HashingTF
-
- setNumFolds(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setNumHashTables(int) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- setNumHashTables(int) - Method in class org.apache.spark.ml.feature.MinHashLSH
-
- setNumItemBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setNumIterations(int) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the number of iterations of gradient descent to run per update.
- setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets number of iterations (default: 1), which should be smaller than or equal to number of
partitions.
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the number of iterations for SGD.
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the maximal number of iterations for L-BFGS.
- setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the number of iterations of gradient descent to run per update.
- setNumIterations(int) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setNumPartitions(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setNumPartitions(int) - Method in class org.apache.spark.ml.fpm.FPGrowth
-
- setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets number of partitions (default: 1).
- setNumPartitions(int) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Sets the number of partitions used by parallel FP-growth (default: same as input data).
- setNumRows(int) - Method in class org.apache.spark.sql.vectorized.ColumnarBatch
-
Sets the number of rows in this batch.
- setNumTopFeatures(int) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setNumTopFeatures(int) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- setNumTrees(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setNumTrees(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setNumTrees(int) - Method in interface org.apache.spark.ml.tree.RandomForestParams
-
- setNumUserBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setOffsetCol(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the value of param offsetCol.
- setOffsetRange(Optional<Offset>, Optional<Offset>) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader
-
Set the desired offset range for input partitions created from this reader.
- setOptimizeDocConcentration(boolean) - Method in class org.apache.spark.ml.clustering.LDA
-
- setOptimizeDocConcentration(boolean) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Sets whether to optimize docConcentration parameter during training.
- setOptimizer(String) - Method in class org.apache.spark.ml.clustering.LDA
-
- setOptimizer(LDAOptimizer) - Method in class org.apache.spark.mllib.clustering.LDA
-
Developer API
- setOptimizer(String) - Method in class org.apache.spark.mllib.clustering.LDA
-
Set the LDAOptimizer used to perform the actual calculation by algorithm name.
- setOrNull(long, int, int) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given unscaled Long, with a given precision and scale,
and return it, or return null if it cannot be set due to overflow.
- setOut(PrintStream) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDF
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.IndexToString
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Interaction
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinHashLSH
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinHashLSHModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.PCA
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.PCAModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
-
- setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.Imputer
-
- setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.ImputerModel
-
- setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- setP(double) - Method in class org.apache.spark.ml.feature.Normalizer
-
- setParallelism(int) - Method in class org.apache.spark.ml.classification.OneVsRest
-
The implementation of parallel one vs.
- setParallelism(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
Set the maximum level of parallelism to evaluate models in parallel.
- setParallelism(int) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
Set the maximum level of parallelism to evaluate models in parallel.
- setParent(Estimator<M>) - Method in class org.apache.spark.ml.Model
-
Sets the parent of this model (Java API).
- setPattern(String) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- setPeacePeriod(int) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-
Set the number of initial batches to ignore.
- setPercentile(double) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setPercentile(double) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.KMeansModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.fpm.FPGrowth
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.PredictionModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.Predictor
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
- setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-
- setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
-
- setProbabilityCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- setProbabilityCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of product blocks to parallelize the computation.
- setPropertiesFile(String) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Set a custom properties file with Spark configuration for the application.
- setPropertiesFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
- setQuantileCalculationStrategy(Enumeration.Value) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setQuantileProbabilities(double[]) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- setQuantileProbabilities(double[]) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- setQuantilesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- setQuantilesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- setRandomCenters(int, double, long) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Initialize random centers, requiring only the number of dimensions.
- setRank(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the rank of the feature matrices computed (number of features).
- setRatingCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.ClassificationModel
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.Classifier
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setRegParam(double) - Method in class org.apache.spark.ml.classification.LinearSVC
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRegParam(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the regularization parameter for L2 regularization.
- setRegParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the regularization parameter.
- setRelativeError(double) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- setRequiredColumns(Configuration, StructType, StructType) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- setRest(long, int, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
- setSample(RDD<Object>) - Method in class org.apache.spark.mllib.stat.KernelDensity
-
Sets the sample to use for density estimation.
- setSample(JavaRDD<Double>) - Method in class org.apache.spark.mllib.stat.KernelDensity
-
Sets the sample to use for density estimation (for Java users).
- setScalingVec(Vector) - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-
- setSeed(long) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setSeed(long) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setSeed(long) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
Set the seed for weights initialization if weights are not set.
- setSeed(long) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setSeed(long) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- setSeed(long) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- setSeed(long) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setSeed(long) - Method in class org.apache.spark.ml.clustering.LDA
-
- setSeed(long) - Method in class org.apache.spark.ml.clustering.LDAModel
-
- setSeed(long) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- setSeed(long) - Method in class org.apache.spark.ml.feature.MinHashLSH
-
- setSeed(long) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setSeed(long) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setSeed(long) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setSeed(long) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setSeed(long) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setSeed(long) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
-
- setSeed(long) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setSeed(long) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans
-
Sets the random seed (default: hash value of the class name).
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the random seed.
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the random seed for cluster initialization.
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.LDA
-
Set the random seed for cluster initialization.
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Set the random seed for cluster initialization.
- setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets random seed (default: a random long integer).
- setSeed(long) - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.WeibullGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Sets a random seed to have deterministic results.
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
-
- setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
-
- setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom
-
Set random seed.
- setSelectorType(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- setSelectorType(String) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
- setSequenceCol(String) - Method in class org.apache.spark.ml.fpm.PrefixSpan
-
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).
- setSize(int) - Method in class org.apache.spark.ml.feature.VectorSizeHint
-
- setSmoothing(double) - Method in class org.apache.spark.ml.classification.NaiveBayes
-
Set the smoothing parameter.
- setSolver(String) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
Sets the value of param solver.
- setSolver(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the solver algorithm used for optimization.
- setSolver(String) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the solver algorithm used for optimization.
- setSparkContextSessionConf(SparkSession, Map<Object, Object>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
-
- setSparkHome(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set a custom Spark installation location for the application.
- setSparkHome(String) - Method in class org.apache.spark.SparkConf
-
Set the location where Spark is installed on worker nodes.
- setSplits(double[]) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setSplitsArray(double[][]) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setSQLReadObject(Function2<DataInputStream, Object, Object>) - Static method in class org.apache.spark.api.r.SerDe
-
- setSQLWriteObject(Function2<DataOutputStream, Object, Object>) - Static method in class org.apache.spark.api.r.SerDe
-
- setSrcCol(String) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- setSrcOnly(long, int, VD) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- setStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline
-
- setStandardization(boolean) - Method in class org.apache.spark.ml.classification.LinearSVC
-
Whether to standardize the training features before fitting the model.
- setStandardization(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Whether to standardize the training features before fitting the model.
- setStandardization(boolean) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Whether to standardize the training features before fitting the model.
- setStartOffset(Optional<Offset>) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader
-
Set the desired start offset for partitions created from this reader.
- setStatement(String) - Method in class org.apache.spark.ml.feature.SQLTransformer
-
- setStepSize(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setStepSize(double) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
Sets the value of param stepSize (applicable only for solver "gd").
- setStepSize(double) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setStepSize(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setStepSize(double) - Method in interface org.apache.spark.ml.tree.GBTParams
-
- setStepSize(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the step size for gradient descent.
- setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the initial step size of SGD for the first step.
- setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the step size for gradient descent.
- setStopWords(String[]) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- setStorageLevel(String) - Method in class org.apache.spark.status.LiveRDD
-
- setStrategy(String) - Method in class org.apache.spark.ml.feature.Imputer
-
Imputation strategy.
- setStringIndexerOrderType(String) - Method in class org.apache.spark.ml.feature.RFormula
-
- setStringOrderType(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.clustering.LDA
-
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setSubsamplingRate(double) - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
-
- setSubsamplingRate(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setTau0(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
A (positive) learning parameter that downweights early iterations.
- setTestMethod(String) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-
Set the statistical method used for significance testing.
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LinearSVC
-
Set threshold in binary classification.
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LinearSVCModel
-
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- setThreshold(double) - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
Set threshold in binary classification, in range [0, 1].
- setThreshold(double) - Method in class org.apache.spark.ml.feature.Binarizer
-
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
Sets the threshold that separates positive predictions from negative predictions
in Binary Logistic Regression.
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel
-
Sets the threshold that separates positive predictions from negative predictions.
- setThresholds(double[]) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- setThresholds(double[]) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- setThresholds(double[]) - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
-
Set thresholds in multiclass (or binary) classification to adjust the probability of
predicting each class.
- setThresholds(double[]) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-
- setThresholds(double[]) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
-
- setTimeoutDuration(long) - Method in interface org.apache.spark.sql.streaming.GroupState
-
Set the timeout duration in ms for this key.
- setTimeoutDuration(String) - Method in interface org.apache.spark.sql.streaming.GroupState
-
Set the timeout duration for this key as a string.
- setTimeoutTimestamp(long) - Method in interface org.apache.spark.sql.streaming.GroupState
-
Set the timeout timestamp for this key as milliseconds in epoch time.
- setTimeoutTimestamp(long, String) - Method in interface org.apache.spark.sql.streaming.GroupState
-
Set the timeout timestamp for this key as milliseconds in epoch time and an additional
duration as a string (e.g.
- setTimeoutTimestamp(Date) - Method in interface org.apache.spark.sql.streaming.GroupState
-
Set the timeout timestamp for this key as a java.sql.Date.
- setTimeoutTimestamp(Date, String) - Method in interface org.apache.spark.sql.streaming.GroupState
-
Set the timeout timestamp for this key as a java.sql.Date and an additional
duration as a string (e.g.
- setTol(double) - Method in class org.apache.spark.ml.classification.LinearSVC
-
Set the convergence tolerance of iterations.
- setTol(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the convergence tolerance of iterations.
- setTol(double) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
-
Set the convergence tolerance of iterations.
- setTol(double) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- setTol(double) - Method in class org.apache.spark.ml.clustering.KMeans
-
- setTol(double) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
Set the convergence tolerance of iterations.
- setTol(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the convergence tolerance of iterations.
- setTol(double) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the convergence tolerance of iterations.
- setToLowercase(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- setTopicConcentration(double) - Method in class org.apache.spark.ml.clustering.LDA
-
- setTopicConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
distributions over terms.
- setTopicDistributionCol(String) - Method in class org.apache.spark.ml.clustering.LDA
-
- setTopicDistributionCol(String) - Method in class org.apache.spark.ml.clustering.LDAModel
-
- setTrainRatio(double) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- setTreeStrategy(Strategy) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setUiRoot(ContextHandler, UIRoot) - Static method in class org.apache.spark.status.api.v1.UIRootFromServletContext
-
- setupCommitter(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapRedCommitProtocol
-
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the updater function to actually perform a gradient step in a given direction.
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the updater function to actually perform a gradient step in a given direction.
- SetupDriver(org.apache.spark.rpc.RpcEndpointRef) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SetupDriver
-
- SetupDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SetupDriver$
-
- setupGroups(int, DefaultPartitionCoalescer.PartitionLocations) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
Initializes targetLen partition groups.
- setupJob(JobContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol
-
Sets up a job.
- setupJob(JobContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
-
- setUpperBoundsOnCoefficients(Matrix) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the upper bounds on coefficients if fitting under bound constrained optimization.
- setUpperBoundsOnIntercepts(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the upper bounds on intercepts if fitting under bound constrained optimization.
- setupTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol
-
Sets up a task within a job.
- setupTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
-
- setupUI(org.apache.spark.ui.SparkUI) - Method in interface org.apache.spark.status.AppHistoryServerPlugin
-
Sets up UI of this plugin to rebuild the history UI.
- setUseNodeIdCache(boolean) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of user blocks to parallelize the computation.
- setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set if the algorithm should validate data before training.
- setValidationIndicatorCol(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setValidationIndicatorCol(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setValidationTol(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setValue(R) - Method in class org.apache.spark.Accumulable
-
Deprecated.
Set the accumulator's value.
- setVarianceCol(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- setVarianceCol(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setVariancePower(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the value of param variancePower.
- setVectorSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets vector size (default: 100).
- setVerbose(boolean) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Enables verbose reporting for SparkSubmit.
- setVerbose(boolean) - Method in class org.apache.spark.launcher.SparkLauncher
-
- setVocabSize(int) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- setWeightCol(String) - Method in class org.apache.spark.ml.classification.LinearSVC
-
Set the value of param weightCol.
- setWeightCol(double) - Method in class org.apache.spark.ml.classification.LinearSVCModel
-
- setWeightCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Sets the value of param weightCol.
- setWeightCol(String) - Method in class org.apache.spark.ml.classification.NaiveBayes
-
Sets the value of param weightCol.
- setWeightCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
-
Sets the value of param weightCol.
- setWeightCol(String) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
-
- setWeightCol(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
-
Sets the value of param weightCol.
- setWeightCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- setWeightCol(String) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Whether to over-/under-sample training instances according to the given weights in weightCol.
- setWindowSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setWindowSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets the window of words (default: 5)
- setWindowSize(int) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
-
Set the number of batches to compute significance tests over.
- setWithMean(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setWithMean(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
Developer API
- setWithStd(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setWithStd(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
Developer API
- sha1(Column) - Static method in class org.apache.spark.sql.functions
-
Calculates the SHA-1 digest of a binary column and returns the value
as a 40 character hex string.
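The 40-character length follows from SHA-1 producing a 20-byte digest, with each byte rendered as two hex digits. A minimal plain-Java sketch of that encoding, using only the JDK's MessageDigest; the class and method names here are hypothetical and this is not Spark's implementation:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha1HexSketch {
    /** Returns the SHA-1 digest of the input bytes as a lowercase hex string. */
    public static String sha1Hex(byte[] input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(input); // SHA-1 always yields 20 bytes
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes still print as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every conforming JDK is required to ship SHA-1.
            throw new IllegalStateException(e);
        }
    }
}
```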
- sha2(Column, int) - Static method in class org.apache.spark.sql.functions
-
Calculates the SHA-2 family of hash functions of a binary column and
returns the value as a hex string.
- shape() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- SharedParamsCodeGen - Class in org.apache.spark.ml.param.shared
-
Code generator for shared params (sharedParams.scala).
- SharedParamsCodeGen() - Constructor for class org.apache.spark.ml.param.shared.SharedParamsCodeGen
-
- SharedReadWrite$() - Constructor for class org.apache.spark.ml.Pipeline.SharedReadWrite$
-
- sharedState() - Method in class org.apache.spark.sql.SparkSession
-
State shared across sessions, including the SparkContext, cached data, listener,
and a catalog that interacts with external systems.
- shiftLeft(Column, int) - Static method in class org.apache.spark.sql.functions
-
Shift the given value numBits left.
- shiftRight(Column, int) - Static method in class org.apache.spark.sql.functions
-
(Signed) shift the given value numBits right.
- shiftRightUnsigned(Column, int) - Static method in class org.apache.spark.sql.functions
-
Unsigned shift the given value numBits right.
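The difference between the signed and unsigned right shifts only shows up for negative values. The sketch below illustrates it with Java's own operators on int; it assumes the column functions follow the same semantics for integer inputs, and the class name is hypothetical:

```java
class ShiftSketch {
    /** Shifts bits left, filling with zeros on the right. */
    public static int shiftLeft(int value, int numBits) {
        return value << numBits;
    }

    /** Signed (arithmetic) shift: the sign bit is copied into the vacated bits. */
    public static int shiftRight(int value, int numBits) {
        return value >> numBits;
    }

    /** Unsigned (logical) shift: vacated bits are filled with zeros. */
    public static int shiftRightUnsigned(int value, int numBits) {
        return value >>> numBits;
    }
}
```

For example, shifting -8 right by one bit gives -4 with the signed shift and 2147483644 with the unsigned shift.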
- SHORT() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable short type.
- ShortestPaths - Class in org.apache.spark.graphx.lib
-
Computes shortest paths to the given set of landmark vertices, returning a graph where each
vertex attribute is a map containing the shortest-path distance to each reachable landmark.
- ShortestPaths() - Constructor for class org.apache.spark.graphx.lib.ShortestPaths
-
- shortName() - Method in interface org.apache.spark.ml.util.MLFormatRegister
-
- shortName() - Method in class org.apache.spark.sql.hive.execution.HiveFileFormat
-
- shortName() - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- shortName() - Method in interface org.apache.spark.sql.sources.DataSourceRegister
-
The string that represents the format that this data source provider uses.
- shortTimeUnitString(TimeUnit) - Static method in class org.apache.spark.streaming.ui.UIUtils
-
Return the short string for a TimeUnit.
- ShortType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the ShortType object.
- ShortType - Class in org.apache.spark.sql.types
-
The data type representing Short values.
- ShortType() - Constructor for class org.apache.spark.sql.types.ShortType
-
- shortVersion(String) - Static method in class org.apache.spark.util.VersionUtils
-
Given a Spark version string, return the short version string.
- shouldCloseFileAfterWrite(SparkConf, boolean) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
-
- shouldDistributeGaussians(int, int) - Static method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Heuristic to distribute the computation of the MultivariateGaussians, approximately when
d is greater than 25 except for when k is very small.
- shouldGoLeft(Vector) - Method in interface org.apache.spark.ml.tree.Split
-
Return true (split to left) or false (split to right).
- shouldGoLeft(int, Split[]) - Method in interface org.apache.spark.ml.tree.Split
-
Return true (split to left) or false (split to right).
- shouldOwn(Param<?>) - Method in interface org.apache.spark.ml.param.Params
-
Validates that the input param belongs to this instance.
- shouldRollover(long) - Method in interface org.apache.spark.util.logging.RollingPolicy
-
Whether rollover should be initiated at this moment
- show(int) - Method in class org.apache.spark.sql.Dataset
-
Displays the Dataset in a tabular form.
- show() - Method in class org.apache.spark.sql.Dataset
-
Displays the top 20 rows of Dataset in a tabular form.
- show(boolean) - Method in class org.apache.spark.sql.Dataset
-
Displays the top 20 rows of Dataset in a tabular form.
- show(int, boolean) - Method in class org.apache.spark.sql.Dataset
-
Displays the Dataset in a tabular form.
- show(int, int) - Method in class org.apache.spark.sql.Dataset
-
Displays the Dataset in a tabular form.
- show(int, int, boolean) - Method in class org.apache.spark.sql.Dataset
-
Displays the Dataset in a tabular form.
- showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Object>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDagVizForJob(int, Seq<org.apache.spark.ui.scope.RDDOperationGraph>) - Static method in class org.apache.spark.ui.UIUtils
-
Return a "DAG visualization" DOM element that expands into a visualization for a job.
- showDagVizForStage(int, Option<org.apache.spark.ui.scope.RDDOperationGraph>) - Static method in class org.apache.spark.ui.UIUtils
-
Return a "DAG visualization" DOM element that expands into a visualization for a stage.
- showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Object>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Object>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-
- shuffle(Column) - Static method in class org.apache.spark.sql.functions
-
Returns a random permutation of the given array.
- SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_DATA() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_LOCAL_BLOCKS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_READ() - Static method in class org.apache.spark.ui.ToolTips
-
- SHUFFLE_READ_BLOCKED_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- SHUFFLE_READ_BLOCKED_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- SHUFFLE_READ_METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
-
- SHUFFLE_READ_RECORDS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_READ_REMOTE_SIZE() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- SHUFFLE_READ_REMOTE_SIZE() - Static method in class org.apache.spark.ui.ToolTips
-
- SHUFFLE_READ_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_REMOTE_BLOCKS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_REMOTE_READS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_REMOTE_READS_TO_DISK() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_TOTAL_BLOCKS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_TOTAL_READS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_WRITE() - Static method in class org.apache.spark.ui.ToolTips
-
- SHUFFLE_WRITE_METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
-
- SHUFFLE_WRITE_RECORDS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_WRITE_SIZE() - Static method in class org.apache.spark.status.TaskIndexNames
-
- SHUFFLE_WRITE_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
-
- ShuffleBlockId - Class in org.apache.spark.storage
-
- ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
-
- shuffleCleaned(int) - Method in interface org.apache.spark.CleanerListener
-
- ShuffleDataBlockId - Class in org.apache.spark.storage
-
- ShuffleDataBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleDataBlockId
-
- ShuffleDependency<K,V,C> - Class in org.apache.spark
-
Developer API
Represents a dependency on the output of a shuffle stage.
- ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Serializer, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.ShuffleDependency
-
- ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd
-
Developer API
The resulting RDD from a shuffle (e.g.
- ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.rdd.ShuffledRDD
-
- shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
-
- shuffleId() - Method in class org.apache.spark.CleanShuffle
-
- shuffleId() - Method in class org.apache.spark.FetchFailed
-
- shuffleId() - Method in class org.apache.spark.ShuffleDependency
-
- shuffleId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- ShuffleIndexBlockId - Class in org.apache.spark.storage
-
- ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
-
- shuffleManager() - Method in class org.apache.spark.SparkEnv
-
- shuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- shuffleRead$() - Constructor for class org.apache.spark.InternalAccumulator.shuffleRead$
-
- shuffleReadBytes() - Method in class org.apache.spark.status.api.v1.StageData
-
- ShuffleReadMetricDistributions - Class in org.apache.spark.status.api.v1
-
- ShuffleReadMetrics - Class in org.apache.spark.status.api.v1
-
- shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- shuffleReadRecords() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- shuffleReadRecords() - Method in class org.apache.spark.status.api.v1.StageData
-
- ShuffleStatus - Class in org.apache.spark
-
Helper class used by the MapOutputTrackerMaster to perform bookkeeping for a single
ShuffleMapStage.
- ShuffleStatus(int) - Constructor for class org.apache.spark.ShuffleStatus
-
- shuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- shuffleWrite$() - Constructor for class org.apache.spark.InternalAccumulator.shuffleWrite$
-
- shuffleWriteBytes() - Method in class org.apache.spark.status.api.v1.StageData
-
- ShuffleWriteMetricDistributions - Class in org.apache.spark.status.api.v1
-
- ShuffleWriteMetrics - Class in org.apache.spark.status.api.v1
-
- shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- shuffleWriteRecords() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- shuffleWriteRecords() - Method in class org.apache.spark.status.api.v1.StageData
-
- shutdown() - Method in interface org.apache.spark.ExecutorPlugin
-
Clean up and terminate this plugin.
- shutdown(ExecutorService, Duration) - Static method in class org.apache.spark.util.ThreadUtils
-
- Shutdown$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.Shutdown$
-
- ShutdownHookManager - Class in org.apache.spark.util
-
Various utility methods used by Spark.
- ShutdownHookManager() - Constructor for class org.apache.spark.util.ShutdownHookManager
-
- sigma() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- sigmas() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- SignalUtils - Class in org.apache.spark.util
-
Contains utilities for working with POSIX signals.
- SignalUtils() - Constructor for class org.apache.spark.util.SignalUtils
-
- signum(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the signum of the given value.
- signum(String) - Static method in class org.apache.spark.sql.functions
-
Computes the signum of the given column.
- SimpleFutureAction<T> - Class in org.apache.spark
-
A FutureAction holding the result of an action that triggers a single job.
- simpleString() - Method in class org.apache.spark.sql.types.ArrayType
-
- simpleString() - Static method in class org.apache.spark.sql.types.BinaryType
-
- simpleString() - Static method in class org.apache.spark.sql.types.BooleanType
-
- simpleString() - Method in class org.apache.spark.sql.types.ByteType
-
- simpleString() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- simpleString() - Method in class org.apache.spark.sql.types.CharType
-
- simpleString() - Method in class org.apache.spark.sql.types.DataType
-
Readable string representation for the type.
- simpleString() - Static method in class org.apache.spark.sql.types.DateType
-
- simpleString() - Method in class org.apache.spark.sql.types.DecimalType
-
- simpleString() - Static method in class org.apache.spark.sql.types.DoubleType
-
- simpleString() - Static method in class org.apache.spark.sql.types.FloatType
-
- simpleString() - Method in class org.apache.spark.sql.types.IntegerType
-
- simpleString() - Method in class org.apache.spark.sql.types.LongType
-
- simpleString() - Method in class org.apache.spark.sql.types.MapType
-
- simpleString() - Static method in class org.apache.spark.sql.types.NullType
-
- simpleString() - Method in class org.apache.spark.sql.types.ObjectType
-
- simpleString() - Method in class org.apache.spark.sql.types.ShortType
-
- simpleString() - Static method in class org.apache.spark.sql.types.StringType
-
- simpleString() - Method in class org.apache.spark.sql.types.StructType
-
- simpleString() - Static method in class org.apache.spark.sql.types.TimestampType
-
- simpleString() - Method in class org.apache.spark.sql.types.VarcharType
-
- SimpleUpdater - Class in org.apache.spark.mllib.optimization
-
Developer API
A simple updater for gradient descent *without* any regularization.
- SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
-
- sin(Column) - Static method in class org.apache.spark.sql.functions
-
- sin(String) - Static method in class org.apache.spark.sql.functions
-
- SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg
-
Represents singular value decomposition (SVD) factors.
- SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- sinh(Column) - Static method in class org.apache.spark.sql.functions
-
- sinh(String) - Static method in class org.apache.spark.sql.functions
-
- Sink - Interface in org.apache.spark.metrics.sink
-
- sink() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
- SinkProgress - Class in org.apache.spark.sql.streaming
-
Information about progress made for a sink in the execution of a StreamingQuery
during a trigger.
- size() - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
-
- size() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Size of the attribute group.
- size() - Method in class org.apache.spark.ml.feature.VectorSizeHint
-
The size of Vectors in inputCol.
- size() - Method in class org.apache.spark.ml.linalg.DenseVector
-
- size() - Method in class org.apache.spark.ml.linalg.SparseVector
-
- size() - Method in interface org.apache.spark.ml.linalg.Vector
-
Size of the vector.
- size() - Method in class org.apache.spark.ml.param.ParamMap
-
Number of param pairs in this map.
- size() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- size() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- size() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Size of the vector.
- size(Column) - Static method in class org.apache.spark.sql.functions
-
Returns length of array or map.
- size() - Method in interface org.apache.spark.sql.Row
-
Number of elements in the Row.
- size() - Method in interface org.apache.spark.storage.BlockData
-
- size() - Method in class org.apache.spark.storage.DiskBlockData
-
- size() - Method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
-
- size() - Method in interface org.apache.spark.storage.memory.MemoryEntry
-
- size() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
-
- SizeEstimator - Class in org.apache.spark.util
-
Developer API
Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in
memory-aware caches.
- SizeEstimator() - Constructor for class org.apache.spark.util.SizeEstimator
-
- sizeInBytes() - Method in class org.apache.spark.sql.sources.BaseRelation
-
Returns an estimated size of this relation in bytes.
- sizeInBytes() - Method in interface org.apache.spark.sql.sources.v2.reader.Statistics
-
- sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Sketches the input RDD via reservoir sampling on each partition.
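Reservoir sampling keeps a fixed-size uniform sample from a stream of unknown length in a single pass. The following is a minimal single-partition sketch of the classic technique (Algorithm R) in plain Java; the class name, method signature, and seed parameter are hypothetical and do not mirror RangePartitioner's internals:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

class ReservoirSketch {
    /** Returns up to k items drawn uniformly at random from the input. */
    public static <T> List<T> sample(Iterator<T> input, int k, long seed) {
        Random rng = new Random(seed);
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0; // number of items consumed so far
        while (input.hasNext()) {
            T item = input.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k items are always kept.
                reservoir.add(item);
            } else {
                // Pick a slot uniformly in [0, seen); the new item survives
                // only if the slot falls inside the reservoir (probability k / seen).
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }
}
```

If the input holds fewer than k items, the reservoir simply contains all of them in their original order.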
- skewness(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the skewness of the values in a group.
- skewness(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the skewness of the values in a group.
- skip(long) - Method in class org.apache.spark.io.NioBufferedFileInputStream
-
- skip(long) - Method in class org.apache.spark.io.ReadAheadInputStream
-
- skip(long) - Method in class org.apache.spark.storage.BufferReleasingInputStream
-
- skippedStages() - Method in class org.apache.spark.status.LiveJob
-
- skippedTasks() - Method in class org.apache.spark.status.LiveJob
-
- skipWhitespace() - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- slice(Column, int, int) - Static method in class org.apache.spark.sql.functions
-
Returns an array containing all the elements in x from index start (or starting from the
end if start is negative) with the specified length.
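The start/length rule for slice can be sketched on a plain Java list. This sketch assumes 1-based indexing (a start of 1 is the first element, a start of -1 is the last) and assumes an out-of-range start yields an empty result; neither detail is stated in this index entry, and the class name is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

class SliceSketch {
    /** Returns up to `length` elements of x beginning at the 1-based index `start`. */
    public static <T> List<T> slice(List<T> x, int start, int length) {
        if (start == 0) {
            throw new IllegalArgumentException("start must not be 0 (indices are 1-based)");
        }
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative");
        }
        // Positive start counts from the front, negative start counts from the end.
        int from = start > 0 ? start - 1 : x.size() + start;
        if (from < 0 || from >= x.size()) {
            return new ArrayList<>(); // assumed behavior for an out-of-range start
        }
        // Clamp the end to the list size; long arithmetic avoids int overflow.
        int to = (int) Math.min((long) x.size(), (long) from + length);
        return new ArrayList<>(x.subList(from, to));
    }
}
```

Under these assumptions, slicing [1, 2, 3, 4] with start 2 and length 2 gives [2, 3], and with start -2 and length 2 gives [3, 4].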
- slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
- slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return all the RDDs defined by the Interval object (both end times included)
- slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return all the RDDs between 'fromTime' and 'toTime' (both included)
- slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
Time interval after which the DStream generates an RDD
- slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- sliding(int, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
-
Returns an RDD from grouping items of its parent RDD in fixed size blocks by passing a sliding
window over them.
- sliding(int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
-
sliding(Int, Int) with step = 1.
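A sliding window over an ordered collection can be sketched locally as follows. This is a single-machine illustration on a Java list, not the distributed RDD implementation; it assumes that trailing windows with fewer than windowSize items are dropped, and the class name is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

class SlidingSketch {
    /** Groups items into fixed-size windows, advancing `step` items each time. */
    public static <T> List<List<T>> sliding(List<T> items, int windowSize, int step) {
        if (windowSize <= 0 || step <= 0) {
            throw new IllegalArgumentException("windowSize and step must be positive");
        }
        List<List<T>> windows = new ArrayList<>();
        // Only emit complete windows; a short tail is skipped (assumption).
        for (int start = 0; start + windowSize <= items.size(); start += step) {
            windows.add(new ArrayList<>(items.subList(start, start + windowSize)));
        }
        return windows;
    }

    /** Convenience overload that advances one item at a time. */
    public static <T> List<List<T>> sliding(List<T> items, int windowSize) {
        return sliding(items, windowSize, 1);
    }
}
```

With the items 1 through 5 and a window size of 3, the single-step overload yields [1, 2, 3], [2, 3, 4], and [3, 4, 5].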
- smoothing() - Method in interface org.apache.spark.ml.classification.NaiveBayesParams
-
The smoothing parameter.
- SnappyCompressionCodec - Class in org.apache.spark.io
-
- SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
-
- SnappyOutputStreamWrapper - Class in org.apache.spark.io
-
Wrapper over SnappyOutputStream which guards against write-after-close and double-close
issues.
- SnappyOutputStreamWrapper(SnappyOutputStream) - Constructor for class org.apache.spark.io.SnappyOutputStreamWrapper
-
- socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Creates an input stream from TCP source hostname:port.
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext
-
Creates an input stream from TCP source hostname:port.
- solve(double, double, DenseVector, DenseVector, DenseVector) - Method in interface org.apache.spark.ml.optim.NormalEquationSolver
-
Solve the normal equations from summary statistics.
- solve(ALS.NormalEquation, double) - Method in interface org.apache.spark.ml.recommendation.ALS.LeastSquaresNESolver
-
Solves a least squares problem with regularization (possibly with other constraints).
- solve(double[], double[]) - Static method in class org.apache.spark.mllib.linalg.CholeskyDecomposition
-
Solves a symmetric positive definite linear system via Cholesky factorization.
- solve(double[], double[], NNLS.Workspace) - Static method in class org.apache.spark.mllib.optimization.NNLS
-
Solve a least squares problem, possibly with nonnegativity constraints, by a modified
projected gradient method.
- solver() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
-
The solver algorithm for optimization.
- solver() - Method in interface org.apache.spark.ml.param.shared.HasSolver
-
Param for the solver algorithm for optimization.
- solver() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
-
The solver algorithm for optimization.
- solver() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
-
- solver() - Method in interface org.apache.spark.ml.regression.LinearRegressionParams
-
The solver algorithm for optimization.
- Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- sort(String, String...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset sorted by the specified column, all in ascending order.
- sort(Column...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset sorted by the given expressions.
- sort(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset sorted by the specified column, all in ascending order.
- sort(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset sorted by the given expressions.
- sort_array(Column) - Static method in class org.apache.spark.sql.functions
-
Sorts the input array for the given column in ascending order,
according to the natural ordering of the array elements.
- sort_array(Column, boolean) - Static method in class org.apache.spark.sql.functions
-
Sorts the input array for the given column in ascending or descending order,
according to the natural ordering of the array elements.
- sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return this RDD sorted by the given key function.
- sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return this RDD sorted by the given key function.
- sortBy(String, String...) - Method in class org.apache.spark.sql.DataFrameWriter
-
Sorts the output in each bucket by the given columns.
- sortBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter
-
Sorts the output in each bucket by the given columns.
- sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements in
ascending order.
- sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortWithinPartitions(String, String...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with each partition sorted by the given expressions.
- sortWithinPartitions(Column...) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with each partition sorted by the given expressions.
- sortWithinPartitions(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with each partition sorted by the given expressions.
- sortWithinPartitions(Seq<Column>) - Method in class org.apache.spark.sql.Dataset
-
Returns a new Dataset with each partition sorted by the given expressions.
- soundex(Column) - Static method in class org.apache.spark.sql.functions
-
Returns the soundex code for the specified expression.
- Source - Interface in org.apache.spark.metrics.source
-
- sourceName() - Static method in class org.apache.spark.metrics.source.CodegenMetrics
-
- sourceName() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
-
- sourceName() - Method in interface org.apache.spark.metrics.source.Source
-
- SourceProgress - Class in org.apache.spark.sql.streaming
-
Information about progress made for a source in the execution of a StreamingQuery
during a trigger.
- sources() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
- sourceSchema(SQLContext, Option<StructType>, String, Map<String, String>) - Method in interface org.apache.spark.sql.sources.StreamSourceProvider
-
Returns the name and schema of the source that can be used to continually read data.
- spark() - Method in class org.apache.spark.status.api.v1.VersionInfo
-
- SPARK_CONNECTOR_NAME() - Static method in class org.apache.spark.ui.JettyUtils
-
- SPARK_CONTEXT_SHUTDOWN_PRIORITY() - Static method in class org.apache.spark.util.ShutdownHookManager
-
The shutdown priority of the SparkContext instance.
- SPARK_IO_ENCRYPTION_COMMONS_CONFIG_PREFIX() - Static method in class org.apache.spark.security.CryptoStreamUtils
-
- SPARK_MASTER - Static variable in class org.apache.spark.launcher.SparkLauncher
-
The Spark master.
- spark_partition_id() - Static method in class org.apache.spark.sql.functions
-
Partition ID.
- SPARK_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
-
- SparkAppConfig(Seq<Tuple2<String, String>>, Option<byte[]>, Option<byte[]>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig
-
- SparkAppConfig$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig$
-
- SparkAppHandle - Interface in org.apache.spark.launcher
-
A handle to a running Spark application.
- SparkAppHandle.Listener - Interface in org.apache.spark.launcher
-
Listener for updates to a handle's state.
- SparkAppHandle.State - Enum in org.apache.spark.launcher
-
Represents the application's state.
- SparkAWSCredentials - Interface in org.apache.spark.streaming.kinesis
-
Serializable interface providing a method executors can call to obtain an
AWSCredentialsProvider instance for authenticating to AWS services.
- SparkAWSCredentials.Builder - Class in org.apache.spark.streaming.kinesis
-
- SparkConf - Class in org.apache.spark
-
Configuration for a Spark application.
- SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
-
- SparkConf() - Constructor for class org.apache.spark.SparkConf
-
Create a SparkConf that loads defaults from system properties and the classpath.
- sparkContext() - Method in class org.apache.spark.rdd.RDD
-
The SparkContext that created this RDD.
- SparkContext - Class in org.apache.spark
-
Main entry point for Spark functionality.
- SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
-
- SparkContext() - Constructor for class org.apache.spark.SparkContext
-
Create a SparkContext that loads settings from system properties (for instance, when
launching with ./bin/spark-submit).
- SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- SparkContext(String, String, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- sparkContext() - Method in class org.apache.spark.sql.SparkSession
-
- sparkContext() - Method in class org.apache.spark.sql.SQLContext
-
- sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
The underlying SparkContext.
- sparkContext() - Method in class org.apache.spark.streaming.StreamingContext
-
Returns the associated Spark context.
- SparkEnv - Class in org.apache.spark
-
Developer API
Holds all the runtime environment objects for a running Spark instance (either master or worker),
including the serializer, RpcEnv, block manager, map output tracker, etc.
- SparkEnv(String, org.apache.spark.rpc.RpcEnv, Serializer, Serializer, org.apache.spark.serializer.SerializerManager, MapOutputTracker, ShuffleManager, org.apache.spark.broadcast.BroadcastManager, org.apache.spark.storage.BlockManager, SecurityManager, org.apache.spark.metrics.MetricsSystem, MemoryManager, org.apache.spark.scheduler.OutputCommitCoordinator, SparkConf) - Constructor for class org.apache.spark.SparkEnv
-
- sparkEventFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- sparkEventToJson(SparkListenerEvent) - Static method in class org.apache.spark.util.JsonProtocol
-
JSON serialization methods for SparkListenerEvents.
- SparkException - Exception in org.apache.spark
-
- SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
-
- SparkException(String) - Constructor for exception org.apache.spark.SparkException
-
- SparkExecutorInfo - Interface in org.apache.spark
-
Exposes information about Spark Executors.
- SparkExecutorInfoImpl - Class in org.apache.spark
-
- SparkExecutorInfoImpl(String, int, long, int, long, long, long, long) - Constructor for class org.apache.spark.SparkExecutorInfoImpl
-
- SparkExitCode - Class in org.apache.spark.util
-
- SparkExitCode() - Constructor for class org.apache.spark.util.SparkExitCode
-
- SparkFiles - Class in org.apache.spark
-
Resolves paths to files added through SparkContext.addFile().
- SparkFiles() - Constructor for class org.apache.spark.SparkFiles
-
- SparkFirehoseListener - Class in org.apache.spark
-
Class that allows users to receive all SparkListener events.
- SparkFirehoseListener() - Constructor for class org.apache.spark.SparkFirehoseListener
-
- SparkHadoopMapRedUtil - Class in org.apache.spark.mapred
-
- SparkHadoopMapRedUtil() - Constructor for class org.apache.spark.mapred.SparkHadoopMapRedUtil
-
- SparkHadoopWriter - Class in org.apache.spark.internal.io
-
A helper object that saves an RDD using a Hadoop OutputFormat.
- SparkHadoopWriter() - Constructor for class org.apache.spark.internal.io.SparkHadoopWriter
-
- SparkHadoopWriterUtils - Class in org.apache.spark.internal.io
-
A helper object that provides common utilities used when saving an RDD using a Hadoop OutputFormat
(both from the old mapred API and the new mapreduce API).
- SparkHadoopWriterUtils() - Constructor for class org.apache.spark.internal.io.SparkHadoopWriterUtils
-
- sparkJavaOpts(SparkConf, Function1<String, Object>) - Static method in class org.apache.spark.util.Utils
-
Convert all spark properties set in the given SparkConf to a sequence of java options.
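A minimal plain-Java sketch of the idea behind this conversion, under stated assumptions: a `Map<String, String>` stands in for the SparkConf, a `Predicate<String>` stands in for the `Function1<String, Object>` key filter, and the `-Dkey=value` output form is assumed rather than taken from this index. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in; this is not the Utils.sparkJavaOpts implementation.
final class JavaOptsSketch {
    /** Turns every property whose key passes the filter into a "-Dkey=value" option. */
    static List<String> toJavaOpts(Map<String, String> props, Predicate<String> keyFilter) {
        List<String> opts = new ArrayList<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (keyFilter.test(e.getKey())) {
                opts.add("-D" + e.getKey() + "=" + e.getValue());
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("spark.app.name", "demo");
        props.put("other.key", "ignored");
        // Keep only keys that start with "spark." (an illustrative filter).
        System.out.println(toJavaOpts(props, k -> k.startsWith("spark.")));
    }
}
```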
- SparkJobInfo - Interface in org.apache.spark
-
Exposes information about Spark Jobs.
- SparkJobInfoImpl - Class in org.apache.spark
-
- SparkJobInfoImpl(int, int[], JobExecutionStatus) - Constructor for class org.apache.spark.SparkJobInfoImpl
-
- SparkLauncher - Class in org.apache.spark.launcher
-
Launcher for Spark applications.
- SparkLauncher() - Constructor for class org.apache.spark.launcher.SparkLauncher
-
- SparkLauncher(Map<String, String>) - Constructor for class org.apache.spark.launcher.SparkLauncher
-
Creates a launcher that will set the given environment variables in the child.
- SparkListener - Class in org.apache.spark.scheduler
-
Developer API
A default implementation for SparkListenerInterface that has no-op implementations for all callbacks.
- SparkListener() - Constructor for class org.apache.spark.scheduler.SparkListener
-
- SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
-
- SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- SparkListenerApplicationStart - Class in org.apache.spark.scheduler
-
- SparkListenerApplicationStart(String, Option<String>, long, String, Option<String>, Option<Map<String, String>>) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
-
- SparkListenerBlockManagerAdded(long, BlockManagerId, long, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
-
- SparkListenerBlockManagerRemoved(long, BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- SparkListenerBlockUpdated - Class in org.apache.spark.scheduler
-
- SparkListenerBlockUpdated(BlockUpdatedInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockUpdated
-
- SparkListenerBus - Interface in org.apache.spark.scheduler
-
- SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
-
- SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
-
- SparkListenerEvent - Interface in org.apache.spark.scheduler
-
- SparkListenerExecutorAdded - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorAdded(long, String, ExecutorInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorAdded
-
- SparkListenerExecutorBlacklisted - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorBlacklisted(long, String, int) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
-
- SparkListenerExecutorBlacklistedForStage - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorBlacklistedForStage(long, String, int, int, int) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
-
- SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler
-
Periodic updates from executors.
- SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, Seq<AccumulableInfo>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- SparkListenerExecutorRemoved - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorRemoved(long, String, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- SparkListenerExecutorUnblacklisted - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorUnblacklisted(long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
-
- SparkListenerInterface - Interface in org.apache.spark.scheduler
-
Interface for listening to events from the Spark scheduler.
- SparkListenerJobEnd - Class in org.apache.spark.scheduler
-
- SparkListenerJobEnd(int, long, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
-
- SparkListenerJobStart - Class in org.apache.spark.scheduler
-
- SparkListenerJobStart(int, long, Seq<StageInfo>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
-
- SparkListenerLogStart - Class in org.apache.spark.scheduler
-
An internal class that describes the metadata of an event log.
- SparkListenerLogStart(String) - Constructor for class org.apache.spark.scheduler.SparkListenerLogStart
-
- SparkListenerNodeBlacklisted - Class in org.apache.spark.scheduler
-
- SparkListenerNodeBlacklisted(long, String, int) - Constructor for class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
-
- SparkListenerNodeBlacklistedForStage - Class in org.apache.spark.scheduler
-
- SparkListenerNodeBlacklistedForStage(long, String, int, int, int) - Constructor for class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
-
- SparkListenerNodeUnblacklisted - Class in org.apache.spark.scheduler
-
- SparkListenerNodeUnblacklisted(long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
-
- SparkListenerSpeculativeTaskSubmitted - Class in org.apache.spark.scheduler
-
- SparkListenerSpeculativeTaskSubmitted(int) - Constructor for class org.apache.spark.scheduler.SparkListenerSpeculativeTaskSubmitted
-
- SparkListenerStageCompleted - Class in org.apache.spark.scheduler
-
- SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
-
- SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- SparkListenerTaskEnd - Class in org.apache.spark.scheduler
-
- SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
-
- SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- SparkListenerTaskStart - Class in org.apache.spark.scheduler
-
- SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
-
- SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
-
- SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- SparkMasterRegex - Class in org.apache.spark
-
A collection of regexes for extracting information from the master string.
- SparkMasterRegex() - Constructor for class org.apache.spark.SparkMasterRegex
-
- sparkProperties() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig
-
- sparkProperties() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
-
- SparkRDefaults - Class in org.apache.spark.api.r
-
- SparkRDefaults() - Constructor for class org.apache.spark.api.r.SparkRDefaults
-
- sparkRPackagePath(boolean) - Static method in class org.apache.spark.api.r.RUtils
-
Get the list of paths for R packages in various deployment modes, of which the first
path is for the SparkR package itself.
- sparkSession() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
-
- sparkSession() - Method in interface org.apache.spark.ml.util.BaseReadWrite
-
Returns the user-specified Spark Session or the default.
- sparkSession() - Method in class org.apache.spark.sql.Dataset
-
- sparkSession() - Method in interface org.apache.spark.sql.hive.HiveStrategies
-
- SparkSession - Class in org.apache.spark.sql
-
The entry point to programming Spark with the Dataset and DataFrame API.
- sparkSession() - Method in class org.apache.spark.sql.SQLContext
-
- sparkSession() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Returns the SparkSession associated with this query.
- SparkSession.Builder - Class in org.apache.spark.sql
-
- SparkSession.implicits$ - Class in org.apache.spark.sql
-
Experimental
(Scala-specific) Implicit methods available in Scala for converting common Scala objects into DataFrames.
- SparkSessionExtensions - Class in org.apache.spark.sql
-
Experimental
Holder for injection points to the SparkSession.
- SparkSessionExtensions() - Constructor for class org.apache.spark.sql.SparkSessionExtensions
-
- SparkShutdownHook - Class in org.apache.spark.util
-
- SparkShutdownHook(int, Function0<BoxedUnit>) - Constructor for class org.apache.spark.util.SparkShutdownHook
-
- SparkStageInfo - Interface in org.apache.spark
-
Exposes information about Spark Stages.
- SparkStageInfoImpl - Class in org.apache.spark
-
- SparkStageInfoImpl(int, int, long, String, int, int, int, int) - Constructor for class org.apache.spark.SparkStageInfoImpl
-
- SparkStatusTracker - Class in org.apache.spark
-
Low-level status reporting APIs for monitoring job and stage progress.
- sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- sparkUser() - Method in class org.apache.spark.SparkContext
-
- sparkUser() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- SparkUserDefinedFunction - Class in org.apache.spark.sql.expressions
-
- SparkUserDefinedFunction() - Constructor for class org.apache.spark.sql.expressions.SparkUserDefinedFunction
-
- sparkVersion() - Method in class org.apache.spark.scheduler.SparkListenerLogStart
-
- sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.ml.linalg.Matrices
-
Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
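To make the CSC layout concrete, here is a plain-Java sketch (not Spark code) that expands the three arrays into a dense matrix. It assumes the `int[]` arguments of `sparse(...)` are, in order, column pointers and row indices; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Illustrative only: a plain-Java view of the CSC layout, not Spark's implementation.
final class CscSketch {
    /**
     * Expands a CSC-encoded matrix into a dense row-major 2D array.
     * colPtrs has numCols + 1 entries; column j owns the entries at
     * positions colPtrs[j] .. colPtrs[j + 1] - 1 of rowIndices and values.
     */
    static double[][] toDense(int numRows, int numCols,
                              int[] colPtrs, int[] rowIndices, double[] values) {
        double[][] dense = new double[numRows][numCols];
        for (int j = 0; j < numCols; j++) {
            for (int k = colPtrs[j]; k < colPtrs[j + 1]; k++) {
                dense[rowIndices[k]][j] = values[k];
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        // Encodes the 3 x 3 matrix
        //   1.0 0.0 4.0
        //   0.0 3.0 5.0
        //   2.0 0.0 6.0
        int[] colPtrs = {0, 2, 3, 6};
        int[] rowIndices = {0, 2, 1, 0, 1, 2};
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        for (double[] row : toDense(3, 3, colPtrs, rowIndices, values)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```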
- sparse(int, int[], double[]) - Static method in class org.apache.spark.ml.linalg.Vectors
-
Creates a sparse vector providing its index array and value array.
- sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.ml.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs.
- sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.ml.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
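The relationship between the pair-based overloads and the index-array/value-array form can be sketched in plain Java. This is an illustration under assumptions, not the Vectors implementation: `Map.Entry<Integer, Double>` stands in for `Tuple2`, the class name is hypothetical, and the rejection of duplicate or out-of-range indices is a choice made for this sketch.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a sparse vector: parallel index and value arrays.
final class SparseVectorSketch {
    final int size;
    final int[] indices;
    final double[] values;

    SparseVectorSketch(int size, int[] indices, double[] values) {
        this.size = size;
        this.indices = indices;
        this.values = values;
    }

    /** Sorts unordered (index, value) pairs by index and splits them into two arrays. */
    static SparseVectorSketch fromPairs(int size, Iterable<Map.Entry<Integer, Double>> pairs) {
        List<Map.Entry<Integer, Double>> sorted = new ArrayList<>();
        for (Map.Entry<Integer, Double> p : pairs) {
            sorted.add(p);
        }
        sorted.sort(Map.Entry.comparingByKey());
        int[] indices = new int[sorted.size()];
        double[] values = new double[sorted.size()];
        int prev = -1;
        for (int i = 0; i < sorted.size(); i++) {
            int idx = sorted.get(i).getKey();
            // Indices must be unique and lie in [0, size).
            if (idx <= prev || idx >= size) {
                throw new IllegalArgumentException("Bad index: " + idx);
            }
            indices[i] = idx;
            values[i] = sorted.get(i).getValue();
            prev = idx;
        }
        return new SparseVectorSketch(size, indices, values);
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, Double>> pairs = new ArrayList<>();
        pairs.add(new AbstractMap.SimpleEntry<>(3, 2.0));
        pairs.add(new AbstractMap.SimpleEntry<>(0, 1.0));
        SparseVectorSketch v = fromPairs(5, pairs);
        System.out.println(Arrays.toString(v.indices) + " " + Arrays.toString(v.values));
    }
}
```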
- sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
- sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector providing its index array and value array.
- sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs.
- sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
- SparseMatrix - Class in org.apache.spark.ml.linalg
-
Column-major sparse matrix.
- SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.ml.linalg.SparseMatrix
-
- SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.ml.linalg.SparseMatrix
-
Column-major sparse matrix.
- SparseMatrix - Class in org.apache.spark.mllib.linalg
-
Column-major sparse matrix.
- SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
-
- SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
-
Column-major sparse matrix.
- SparseVector - Class in org.apache.spark.ml.linalg
-
A sparse vector represented by an index array and a value array.
- SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.ml.linalg.SparseVector
-
- SparseVector - Class in org.apache.spark.mllib.linalg
-
A sparse vector represented by an index array and a value array.
- SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
-
- SPARSITY() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
-
- sparsity() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- spdiag(Vector) - Static method in class org.apache.spark.ml.linalg.SparseMatrix
-
Generate a diagonal matrix in SparseMatrix format from the supplied values.
- spdiag(Vector) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a diagonal matrix in SparseMatrix format from the supplied values.
- SpearmanCorrelation - Class in org.apache.spark.mllib.stat.correlation
-
Compute Spearman's correlation for two RDDs of the type RDD[Double] or the correlation matrix
for an RDD of the type RDD[Vector].
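For two sequences, Spearman's correlation is Pearson's correlation computed on ranks. The sketch below shows that definition on plain `double[]` arrays instead of RDDs; assigning tied values the average of their ranks is the usual convention and is an assumption here, not something this index states about Spark. The class name is hypothetical.

```java
import java.util.Arrays;

// Illustrative rank-then-Pearson computation; not Spark's distributed implementation.
final class SpearmanSketch {
    /** 1-based ranks; tied values share the average of their positions. */
    static double[] ranks(double[] xs) {
        int n = xs.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(xs[a], xs[b]));
        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            int end = start;
            while (end + 1 < n && xs[order[end + 1]] == xs[order[start]]) {
                end++;
            }
            double avg = (start + end) / 2.0 + 1.0; // mean of positions start+1 .. end+1
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = avg;
            }
            start = end + 1;
        }
        return ranks;
    }

    /** Pearson correlation of two equally sized arrays. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = Arrays.stream(x).average().orElse(0.0);
        double meanY = Arrays.stream(y).average().orElse(0.0);
        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    static double spearman(double[] x, double[] y) {
        return pearson(ranks(x), ranks(y));
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {1, 4, 9, 16, 25}; // monotone in x, so the rank correlation is 1
        System.out.println(spearman(x, y));
    }
}
```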
- SpearmanCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
- SpecialLengths - Class in org.apache.spark.api.r
-
- SpecialLengths() - Constructor for class org.apache.spark.api.r.SpecialLengths
-
- speculative() - Method in class org.apache.spark.scheduler.TaskInfo
-
- speculative() - Method in class org.apache.spark.status.api.v1.TaskData
-
- speye(int) - Static method in class org.apache.spark.ml.linalg.Matrices
-
Generate a sparse Identity Matrix in Matrix format.
- speye(int) - Static method in class org.apache.spark.ml.linalg.SparseMatrix
-
Generate an Identity Matrix in SparseMatrix format.
- speye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a sparse Identity Matrix in Matrix format.
- speye(int) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate an Identity Matrix in SparseMatrix format.
- SpillListener - Class in org.apache.spark
-
A SparkListener that detects whether spills have occurred in Spark jobs.
- SpillListener() - Constructor for class org.apache.spark.SpillListener
-
- split() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
-
- split() - Method in class org.apache.spark.ml.tree.InternalNode
-
- Split - Interface in org.apache.spark.ml.tree
-
Interface for a "Split," which specifies a test made at a decision tree node
to choose the left or right path.
- split() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- split() - Method in class org.apache.spark.mllib.tree.model.Node
-
- Split - Class in org.apache.spark.mllib.tree.model
-
Developer API
Split applied to a feature
param: feature feature index
param: threshold Threshold for continuous feature.
- Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
-
- split(Column, String) - Static method in class org.apache.spark.sql.functions
-
Splits str around pattern (pattern is a regular expression).
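Because the pattern is a regular expression, metacharacters in it are interpreted rather than matched literally. The plain-Java sketch below illustrates that behavior with `java.util.regex.Pattern` on a single string; it is an analogy, not the Spark column function, and keeping trailing empty strings (limit `-1`) is an assumption of this sketch. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative single-string analogue of splitting around a regex pattern.
final class RegexSplitSketch {
    /** Splits str around matches of the regex, keeping trailing empty strings. */
    static String[] splitAround(String str, String pattern) {
        return Pattern.compile(pattern).split(str, -1);
    }

    public static void main(String[] args) {
        // "[0-9]+" matches runs of digits, so the digits act as separators.
        System.out.println(Arrays.toString(splitAround("a1b22c", "[0-9]+")));
        // To split on a literal dot, the dot must be escaped.
        System.out.println(Arrays.toString(splitAround("x.y.z", "\\.")));
    }
}
```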
- splitAndCountPartitions(Iterator<String>) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
Splits lines and counts the words.
- splitCommandString(String) - Static method in class org.apache.spark.util.Utils
-
Split a string of potentially quoted arguments from the command line the way that a shell
would do it to determine arguments to a command.
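A simplified plain-Java sketch of shell-style argument splitting follows. It is not the Utils implementation and covers only an assumed subset of shell rules: whitespace separates arguments, single quotes are literal, and a backslash escapes the next character both outside quotes and inside double quotes (a real shell is stricter inside double quotes). The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified shell-like tokenizer; an illustration, not Spark's Utils.splitCommandString.
final class CommandSplitSketch {
    static List<String> split(String s) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inSingle = false;
        boolean inDouble = false;
        boolean inWord = false; // true once the current argument has started
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inSingle) {
                if (c == '\'') inSingle = false; else cur.append(c);
            } else if (inDouble) {
                if (c == '"') {
                    inDouble = false;
                } else if (c == '\\' && i + 1 < s.length()) {
                    cur.append(s.charAt(++i)); // escaped character inside double quotes
                } else {
                    cur.append(c);
                }
            } else if (c == '\'') {
                inSingle = true;
                inWord = true;
            } else if (c == '"') {
                inDouble = true;
                inWord = true;
            } else if (c == '\\' && i + 1 < s.length()) {
                cur.append(s.charAt(++i)); // escaped character outside quotes
                inWord = true;
            } else if (Character.isWhitespace(c)) {
                if (inWord) {
                    out.add(cur.toString());
                    cur.setLength(0);
                    inWord = false;
                }
            } else {
                cur.append(c);
                inWord = true;
            }
        }
        if (inWord) out.add(cur.toString());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(split("-Dkey=\"a b\" 'c d' e\\ f"));
    }
}
```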
- SplitData(int, double[], int) - Constructor for class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
-
- SplitData(int, double, int, Seq<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- SplitData$() - Constructor for class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData$
-
- SplitData$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData$
-
- splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
-
- SplitInfo - Class in org.apache.spark.scheduler
-
- SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
-
- splits() - Method in class org.apache.spark.ml.feature.Bucketizer
-
Parameter for mapping continuous features into buckets.
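A plain-Java sketch of how a sorted array of split points can map a value to a bucket index. The conventions here are assumptions not stated in this index: n + 1 split points define n buckets, each bucket is half-open `[x, y)`, and the last bucket also includes its upper bound. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Illustrative bucket lookup over sorted split points; not the Bucketizer implementation.
final class BucketSketch {
    /** Returns the index of the bucket that contains value. */
    static int bucketOf(double[] splits, double value) {
        int last = splits.length - 1;
        if (Double.isNaN(value) || value < splits[0] || value > splits[last]) {
            throw new IllegalArgumentException("Value out of range: " + value);
        }
        if (value == splits[last]) {
            return last - 1; // the last bucket includes its upper bound
        }
        int idx = Arrays.binarySearch(splits, value);
        // Exact hit: value is the lower bound of bucket idx.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the bucket is insertionPoint - 1.
        return idx >= 0 ? idx : -idx - 2;
    }

    public static void main(String[] args) {
        double[] splits = {0.0, 10.0, 20.0, 30.0};
        for (double v : new double[]{5.0, 10.0, 29.9, 30.0}) {
            System.out.println(v + " -> bucket " + bucketOf(splits, v));
        }
    }
}
```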
- splitsArray() - Method in class org.apache.spark.ml.feature.Bucketizer
-
Parameter for specifying multiple splits parameters.
- spr(double, Vector, DenseVector) - Static method in class org.apache.spark.ml.linalg.BLAS
-
Adds alpha * x * x.t to a matrix in-place.
- spr(double, Vector, double[]) - Static method in class org.apache.spark.ml.linalg.BLAS
-
Adds alpha * x * x.t to a matrix in-place.
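The rank-one update can be sketched in plain Java. The sketch assumes the matrix is symmetric and passed as its upper triangle in packed column-major order (entry (i, j) with i <= j at position j * (j + 1) / 2 + i), which is the usual BLAS `spr` convention; this index entry does not state the storage layout. The class name is hypothetical.

```java
import java.util.Arrays;

// Illustrative symmetric rank-one update on packed storage; not Spark's BLAS code.
final class SprSketch {
    /** Adds alpha * x * x^T to the packed upper triangle, in place. */
    static void spr(double alpha, double[] x, double[] packedUpper) {
        int n = x.length;
        if (packedUpper.length != n * (n + 1) / 2) {
            throw new IllegalArgumentException("Packed array must have n * (n + 1) / 2 entries");
        }
        for (int j = 0; j < n; j++) {
            int colStart = j * (j + 1) / 2; // offset of column j in packed storage
            for (int i = 0; i <= j; i++) {
                packedUpper[colStart + i] += alpha * x[i] * x[j];
            }
        }
    }

    public static void main(String[] args) {
        double[] u = new double[3]; // packed upper triangle of a 2 x 2 zero matrix
        spr(2.0, new double[]{1.0, 2.0}, u);
        System.out.println(Arrays.toString(u));
    }
}
```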
- spr(double, Vector, DenseVector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
Adds alpha * v * v.t to a matrix in-place.
- spr(double, Vector, double[]) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
Adds alpha * v * v.t to a matrix in-place.
- sprand(int, int, double, Random) - Static method in class org.apache.spark.ml.linalg.Matrices
-
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
- sprand(int, int, double, Random) - Static method in class org.apache.spark.ml.linalg.SparseMatrix
-
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
- sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
- sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.ml.linalg.Matrices
-
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.ml.linalg.SparseMatrix
-
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
- sqdist(Vector, Vector) - Static method in class org.apache.spark.ml.linalg.Vectors
-
Returns the squared distance between two Vectors.
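The squared distance is the sum of squared element-wise differences, with no square root taken. A minimal plain-Java sketch on dense `double[]` arrays (the Spark method takes Vector arguments; the class name here is hypothetical):

```java
// Illustrative dense-array version; not the Vectors.sqdist implementation.
final class SqdistSketch {
    /** Returns the sum over i of (a[i] - b[i])^2. */
    static double sqdist(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Vector sizes differ: " + a.length + " vs " + b.length);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    public static void main(String[] args) {
        // (1-4)^2 + (2-6)^2 + (3-3)^2 = 9 + 16 + 0
        System.out.println(sqdist(new double[]{1, 2, 3}, new double[]{4, 6, 3})); // prints 25.0
    }
}
```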
- sqdist(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Returns the squared distance between two Vectors.
- sql(String) - Method in class org.apache.spark.sql.SparkSession
-
Executes a SQL query using Spark, returning the result as a DataFrame.
- sql(String) - Method in class org.apache.spark.sql.SQLContext
-
- sql() - Method in class org.apache.spark.sql.types.ArrayType
-
- sql() - Static method in class org.apache.spark.sql.types.BinaryType
-
- sql() - Static method in class org.apache.spark.sql.types.BooleanType
-
- sql() - Static method in class org.apache.spark.sql.types.ByteType
-
- sql() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- sql() - Method in class org.apache.spark.sql.types.DataType
-
- sql() - Static method in class org.apache.spark.sql.types.DateType
-
- sql() - Method in class org.apache.spark.sql.types.DecimalType
-
- sql() - Static method in class org.apache.spark.sql.types.DoubleType
-
- sql() - Static method in class org.apache.spark.sql.types.FloatType
-
- sql() - Static method in class org.apache.spark.sql.types.IntegerType
-
- sql() - Static method in class org.apache.spark.sql.types.LongType
-
- sql() - Method in class org.apache.spark.sql.types.MapType
-
- sql() - Static method in class org.apache.spark.sql.types.NullType
-
- sql() - Static method in class org.apache.spark.sql.types.ShortType
-
- sql() - Static method in class org.apache.spark.sql.types.StringType
-
- sql() - Method in class org.apache.spark.sql.types.StructType
-
- sql() - Static method in class org.apache.spark.sql.types.TimestampType
-
- sqlContext() - Method in interface org.apache.spark.ml.util.BaseReadWrite
-
Returns the user-specified SQL context or the default.
- sqlContext() - Method in class org.apache.spark.sql.Dataset
-
- sqlContext() - Method in class org.apache.spark.sql.sources.BaseRelation
-
- sqlContext() - Method in class org.apache.spark.sql.SparkSession
-
A wrapped version of this session in the form of a SQLContext, for backward compatibility.
- SQLContext - Class in org.apache.spark.sql
-
The entry point for working with structured data (rows and columns) in Spark 1.x.
- SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-
- SQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-
- SQLContext.implicits$ - Class in org.apache.spark.sql
-
Experimental
(Scala-specific) Implicit methods available in Scala for converting common Scala objects into DataFrames.
- SQLDataTypes - Class in org.apache.spark.ml.linalg
-
Developer API
SQL data types for vectors and matrices.
- SQLDataTypes() - Constructor for class org.apache.spark.ml.linalg.SQLDataTypes
-
- SQLImplicits - Class in org.apache.spark.sql
-
A collection of implicit methods for converting common Scala objects into Datasets.
- SQLImplicits() - Constructor for class org.apache.spark.sql.SQLImplicits
-
- SQLImplicits.StringToColumn - Class in org.apache.spark.sql
-
Converts $"col name" into a Column.
- SQLTransformer - Class in org.apache.spark.ml.feature
-
Implements the transformations defined by a SQL statement.
- SQLTransformer(String) - Constructor for class org.apache.spark.ml.feature.SQLTransformer
-
- SQLTransformer() - Constructor for class org.apache.spark.ml.feature.SQLTransformer
-
- sqlType() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- SQLUserDefinedType - Annotation Type in org.apache.spark.sql.types
-
Developer API
A user-defined type which can be automatically recognized by a SQLContext and registered.
- SQLUtils - Class in org.apache.spark.sql.api.r
-
- SQLUtils() - Constructor for class org.apache.spark.sql.api.r.SQLUtils
-
- sqrt(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the square root of the specified float value.
- sqrt(String) - Static method in class org.apache.spark.sql.functions
-
Computes the square root of the specified float value.
- Sqrt$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
-
- SquaredError - Class in org.apache.spark.mllib.tree.loss
-
Developer API
Class for squared error loss calculation.
- SquaredError() - Constructor for class org.apache.spark.mllib.tree.loss.SquaredError
-
- SquaredEuclideanSilhouette - Class in org.apache.spark.ml.evaluation
-
SquaredEuclideanSilhouette computes the average of the
Silhouette over all the data of the dataset, which is
a measure of how appropriately the data have been clustered.
- SquaredEuclideanSilhouette() - Constructor for class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
-
- SquaredEuclideanSilhouette.ClusterStats - Class in org.apache.spark.ml.evaluation
-
- SquaredEuclideanSilhouette.ClusterStats$ - Class in org.apache.spark.ml.evaluation
-
- SquaredL2Updater - Class in org.apache.spark.mllib.optimization
-
Developer API
Updater for L2 regularized problems.
- SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- squaredNormSum() - Method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats
-
- Src - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose the source and edge fields but not the destination field.
- srcAttr() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex attribute of the edge's source vertex.
- srcAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
-
The source vertex attribute.
- srcAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- srcCol() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
-
Param for the name of the input column for source vertex IDs.
- srcId() - Method in class org.apache.spark.graphx.Edge
-
- srcId() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex id of the edge's source vertex.
- srcId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- stackTrace() - Method in class org.apache.spark.ExceptionFailure
-
- StackTrace - Class in org.apache.spark.status.api.v1
-
- StackTrace(Seq<String>) - Constructor for class org.apache.spark.status.api.v1.StackTrace
-
- stackTrace() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
-
- stackTraceFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- stackTraceToJson(StackTraceElement[]) - Static method in class org.apache.spark.util.JsonProtocol
-
- stage() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- STAGE() - Static method in class org.apache.spark.status.TaskIndexNames
-
- STAGE_DAG() - Static method in class org.apache.spark.ui.ToolTips
-
- STAGE_TIMELINE() - Static method in class org.apache.spark.ui.ToolTips
-
- stageAttempt() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- stageAttemptId() - Method in class org.apache.spark.ContextBarrierId
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- stageAttemptNumber() - Method in class org.apache.spark.BarrierTaskContext
-
- stageAttemptNumber() - Method in class org.apache.spark.TaskContext
-
How many times the stage that this task belongs to has been attempted.
- stageCompletedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- stageCompletedToJson(SparkListenerStageCompleted) - Static method in class org.apache.spark.util.JsonProtocol
-
- StageData - Class in org.apache.spark.status.api.v1
-
- stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
-
- stageId() - Method in class org.apache.spark.BarrierTaskContext
-
- stageId() - Method in class org.apache.spark.ContextBarrierId
-
- stageId() - Method in interface org.apache.spark.scheduler.Schedulable
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerSpeculativeTaskSubmitted
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- stageId() - Method in class org.apache.spark.scheduler.StageInfo
-
- stageId() - Method in interface org.apache.spark.SparkStageInfo
-
- stageId() - Method in class org.apache.spark.SparkStageInfoImpl
-
- stageId() - Method in class org.apache.spark.status.api.v1.StageData
-
- stageId() - Method in class org.apache.spark.TaskContext
-
The ID of the stage that this task belongs to.
- stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- stageIds() - Method in interface org.apache.spark.SparkJobInfo
-
- stageIds() - Method in class org.apache.spark.SparkJobInfoImpl
-
- stageIds() - Method in class org.apache.spark.status.api.v1.JobData
-
- stageIds() - Method in class org.apache.spark.status.LiveJob
-
- stageIds() - Method in class org.apache.spark.status.SchedulerPool
-
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- StageInfo - Class in org.apache.spark.scheduler
-
Developer API
Stores information about a stage to pass from the scheduler to SparkListeners.
- StageInfo(int, int, String, int, Seq<RDDInfo>, Seq<Object>, String, TaskMetrics, Seq<Seq<TaskLocation>>) - Constructor for class org.apache.spark.scheduler.StageInfo
-
- stageInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
JSON deserialization methods for classes SparkListenerEvents depend on.
- stageInfos() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- stageInfoToJson(StageInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
JSON serialization methods for classes SparkListenerEvents depend on.
- stageName() - Method in class org.apache.spark.ml.clustering.InternalKMeansModelWriter
-
- stageName() - Method in class org.apache.spark.ml.clustering.PMMLKMeansModelWriter
-
- stageName() - Method in class org.apache.spark.ml.regression.InternalLinearRegressionModelWriter
-
- stageName() - Method in class org.apache.spark.ml.regression.PMMLLinearRegressionModelWriter
-
- stageName() - Method in interface org.apache.spark.ml.util.MLFormatRegister
-
The string that represents the stage type that this writer supports.
- stages() - Method in class org.apache.spark.ml.Pipeline
-
param for pipeline stages
- stages() - Method in class org.apache.spark.ml.PipelineModel
-
- StageStatus - Enum in org.apache.spark.status.api.v1
-
- stageSubmittedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- stageSubmittedToJson(SparkListenerStageSubmitted) - Static method in class org.apache.spark.util.JsonProtocol
-
- standardization() - Method in interface org.apache.spark.ml.param.shared.HasStandardization
-
Param for whether to standardize the training features before fitting the model.
- StandardNormalGenerator - Class in org.apache.spark.mllib.random
-
Developer API
Generates i.i.d. samples from the standard normal distribution.
- StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
-
- StandardScaler - Class in org.apache.spark.ml.feature
-
Standardizes features by removing the mean and scaling to unit variance using column summary
statistics on the samples in the training set.
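The standardization described in this entry can be sketched in plain Java without any Spark dependency. This is an illustration of the idea only, not Spark's implementation: the class name `StandardizeSketch`, the use of the sample standard deviation (dividing by n - 1), and the handling of constant columns are all assumptions made for the example. It expects at least two rows.

```java
/** Illustrative only: a dependency-free sketch, not Spark's StandardScaler. */
final class StandardizeSketch {
    /** Returns a copy of rows with each column shifted to mean 0 and scaled to unit sample std. */
    static double[][] standardize(double[][] rows) {
        int n = rows.length;
        int d = rows[0].length;
        double[] mean = new double[d];
        double[] std = new double[d];
        // Column means computed from the "training set".
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j] / n;
            }
        }
        // Column sample standard deviations (assumption: n - 1 in the denominator).
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                double diff = row[j] - mean[j];
                std[j] += diff * diff;
            }
        }
        for (int j = 0; j < d; j++) {
            std[j] = Math.sqrt(std[j] / (n - 1));
        }
        double[][] out = new double[n][d];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                // A constant column has std 0; map it to 0.0 instead of dividing by zero.
                out[i][j] = std[j] == 0.0 ? 0.0 : (rows[i][j] - mean[j]) / std[j];
            }
        }
        return out;
    }
}
```

Calling `StandardizeSketch.standardize` on a small matrix returns a new matrix and leaves the input untouched; the column statistics come from the same rows that are transformed.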
- StandardScaler(String) - Constructor for class org.apache.spark.ml.feature.StandardScaler
-
- StandardScaler() - Constructor for class org.apache.spark.ml.feature.StandardScaler
-
- StandardScaler - Class in org.apache.spark.mllib.feature
-
Standardizes features by removing the mean and scaling to unit std using column summary
statistics on the samples in the training set.
- StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-
- StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-
- StandardScalerModel - Class in org.apache.spark.ml.feature
-
- StandardScalerModel - Class in org.apache.spark.mllib.feature
-
Represents a StandardScaler model that can transform vectors.
- StandardScalerModel(Vector, Vector, boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- StandardScalerModel(Vector, Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- StandardScalerModel(Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- StandardScalerParams - Interface in org.apache.spark.ml.feature
-
- starGraph(SparkContext, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Create a star graph with vertex 0 being the center.
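As a rough picture of what a star graph with vertex 0 at the center looks like, the sketch below builds its edge list in plain Java. The class name `StarGraphSketch` is hypothetical, and the edge direction (each outer vertex pointing at vertex 0) is an assumption chosen for the example; it does not use GraphX.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: builds the edge list of a star graph centered on vertex 0. */
final class StarGraphSketch {
    /** Each edge is {source, destination}; every non-center vertex points at vertex 0. */
    static List<long[]> starEdges(int numVertices) {
        List<long[]> edges = new ArrayList<>();
        for (long v = 1; v < numVertices; v++) {
            edges.add(new long[] {v, 0L});
        }
        return edges;
    }
}
```

A star over n vertices therefore has n - 1 edges, all sharing the center as one endpoint.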
- start() - Method in interface org.apache.spark.metrics.sink.Sink
-
- start() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- start() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- start(String) - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Starts the execution of the streaming query, which will continually output results to the given
path as new data arrives.
- start() - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Starts the execution of the streaming query, which will continually output results to the given
path as new data arrives.
- start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Start the execution of the streams.
- start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- start() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Method called to start receiving data.
- start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- start() - Method in class org.apache.spark.streaming.StreamingContext
-
Start the execution of the streams.
- startApplication(SparkAppHandle.Listener...) - Method in class org.apache.spark.launcher.AbstractLauncher
-
Starts a Spark application.
- startApplication(SparkAppHandle.Listener...) - Method in class org.apache.spark.launcher.InProcessLauncher
-
Starts a Spark application.
- startApplication(SparkAppHandle.Listener...) - Method in class org.apache.spark.launcher.SparkLauncher
-
Starts a Spark application.
- startIndexInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the first node in the given level.
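One common way to index the nodes of a binary tree is heap-style numbering with the root at index 1, in which case the first node of a level has a closed form. The sketch below assumes that numbering scheme; the class name `NodeIndexSketch` and the helper `levelOf` are hypothetical and exist only for this illustration.

```java
/** Illustrative only: heap-style node numbering with the root at index 1 (an assumption). */
final class NodeIndexSketch {
    /** Index of the first (leftmost) node in the given level; level 0 is the root. */
    static int startIndexInLevel(int level) {
        return 1 << level;
    }

    /** Level that contains the given node index under the same numbering. */
    static int levelOf(int nodeIndex) {
        return 31 - Integer.numberOfLeadingZeros(nodeIndex);
    }
}
```

Under this scheme level 2 holds indices 4 through 7, so `startIndexInLevel(2)` is 4 and `levelOf(5)` is 2.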
- startJettyServer(String, int, org.apache.spark.SSLOptions, Seq<ServletContextHandler>, SparkConf, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Attempt to start a Jetty server bound to the supplied hostName:port using the given
context handlers.
- startOffset() - Method in class org.apache.spark.sql.streaming.SourceProgress
-
- startOffset() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
-
- startPosition() - Method in exception org.apache.spark.sql.AnalysisException
-
- startServiceOnPort(int, Function1<Object, Tuple2<T, Object>>, SparkConf, String) - Static method in class org.apache.spark.util.Utils
-
Attempt to start a service on the given port, or fail after a number of attempts.
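The "try a port, or fail after a number of attempts" pattern can be sketched as a small retry loop. This is not the code in `Utils`: the class `PortRetrySketch`, the parameter `maxRetries`, and the policy of simply trying the next higher port are assumptions made for the example, and the service is abstracted as a function so that the sketch stays free of real sockets.

```java
import java.util.function.IntFunction;

/** Illustrative only: a generic "start on a port, retry on failure" loop. */
final class PortRetrySketch {
    /**
     * Calls startService with startPort, startPort + 1, ... until one call succeeds.
     * Gives up after maxRetries additional attempts and rethrows the last failure as the cause.
     */
    static <T> T startOnPort(int startPort, int maxRetries, IntFunction<T> startService) {
        RuntimeException last = null;
        for (int offset = 0; offset <= maxRetries; offset++) {
            // Port 0 conventionally means "let the OS choose", so it is never incremented.
            int port = startPort == 0 ? 0 : startPort + offset;
            try {
                return startService.apply(port);
            } catch (RuntimeException e) {
                last = e;
            }
        }
        throw new IllegalStateException(
            "Could not start service after " + (maxRetries + 1) + " attempts", last);
    }
}
```

A caller would pass a function that binds the service and throws when the port is unavailable; the first successful result is returned unchanged.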
- startsWith(Column) - Method in class org.apache.spark.sql.Column
-
String starts with.
- startsWith(String) - Method in class org.apache.spark.sql.Column
-
String starts with another string literal.
- startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- startTime() - Method in class org.apache.spark.SparkContext
-
- startTime() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- startTime() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
-
- startTime() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
-
- startTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-
- stat() - Method in class org.apache.spark.sql.Dataset
-
- StatCounter - Class in org.apache.spark.util
-
A class for tracking the statistics of a set of numbers (count, mean and variance) in a
numerically robust way.
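A standard technique for tracking count, mean, and variance in a numerically robust way is a one-pass incremental update (often attributed to Welford). The sketch below shows that technique in plain Java; the class name `RunningStats` is hypothetical, and nothing here is taken from the source of `StatCounter` itself.

```java
/** Illustrative only: one-pass running count, mean, and population variance. */
final class RunningStats {
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // running sum of squared deviations from the mean

    /** Folds one value into the running statistics. */
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        // Uses the updated mean, which avoids subtracting two large, nearly equal sums.
        m2 += delta * (x - mean);
    }

    long count() {
        return n;
    }

    double mean() {
        return mean;
    }

    /** Population variance (divides by n); NaN when no values have been added. */
    double variance() {
        return n == 0 ? Double.NaN : m2 / n;
    }
}
```

Because each value is folded in as it arrives, the statistics are available at any point without storing the values or making a second pass.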
- StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
-
- StatCounter() - Constructor for class org.apache.spark.util.StatCounter
-
Initialize the StatCounter with no values.
- state() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- state() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- State<S> - Class in org.apache.spark.streaming
-
Experimental
Abstract class for getting and updating the state in mapping function used in the
mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).
- State() - Constructor for class org.apache.spark.streaming.State
-
- stateChanged(SparkAppHandle) - Method in interface org.apache.spark.launcher.SparkAppHandle.Listener
-
Callback for changes in the handle's state.
- statement() - Method in class org.apache.spark.ml.feature.SQLTransformer
-
SQL statement parameter.
- StateOperatorProgress - Class in org.apache.spark.sql.streaming
-
Information about updates made to stateful operators in a StreamingQuery during a trigger.
- stateOperators() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
- stateSnapshots() - Method in class org.apache.spark.streaming.api.java.JavaMapWithStateDStream
-
- stateSnapshots() - Method in class org.apache.spark.streaming.dstream.MapWithStateDStream
-
Return a pair DStream where each RDD is the snapshot of the state of all the keys.
- StateSpec<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming
-
Experimental
Abstract class representing all the specifications of the DStream transformation
mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).
- StateSpec() - Constructor for class org.apache.spark.streaming.StateSpec
-
- staticPageRank(int, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run PageRank for a fixed number of iterations returning a graph with vertex attributes
containing the PageRank and edge attributes the normalized edge weight.
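To make "a fixed number of iterations" and "normalized edge weight" concrete, here is a small dependency-free PageRank sketch. It is an assumption-laden illustration rather than the GraphX algorithm: the class `StaticPageRankSketch`, the initial rank of 1.0, and the unnormalized update `resetProb + (1 - resetProb) * incoming` are choices made for the example, with the second numeric argument treated as the reset probability.

```java
import java.util.Arrays;

/** Illustrative only: fixed-iteration PageRank over an edge list. */
final class StaticPageRankSketch {
    /** edges[i] = {src, dst}; returns one rank per vertex after numIter iterations. */
    static double[] pageRank(int numVertices, int[][] edges, int numIter, double resetProb) {
        int[] outDegree = new int[numVertices];
        for (int[] e : edges) {
            outDegree[e[0]]++;
        }
        double[] rank = new double[numVertices];
        Arrays.fill(rank, 1.0);
        for (int iter = 0; iter < numIter; iter++) {
            double[] incoming = new double[numVertices];
            for (int[] e : edges) {
                // Normalized edge weight: 1 / out-degree of the source vertex.
                incoming[e[1]] += rank[e[0]] / outDegree[e[0]];
            }
            for (int v = 0; v < numVertices; v++) {
                rank[v] = resetProb + (1.0 - resetProb) * incoming[v];
            }
        }
        return rank;
    }
}
```

Running a fixed number of iterations trades convergence guarantees for predictable cost; a tolerance-based loop would instead stop once the ranks change by less than a threshold.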
- staticParallelPersonalizedPageRank(long[], int, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run parallel personalized PageRank for a given array of source vertices, such
that all random walks are started relative to the source vertices.
- staticPersonalizedPageRank(long, int, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run Personalized PageRank for a fixed number of iterations, with all iterations originating at
the source node, returning a graph with vertex attributes containing the PageRank and edge
attributes the normalized edge weight.
- StaticSources - Class in org.apache.spark.metrics.source
-
- StaticSources() - Constructor for class org.apache.spark.metrics.source.StaticSources
-
- statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- statistic() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
-
- statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Test statistic.
- Statistics - Class in org.apache.spark.mllib.stat
-
API for statistical functions in MLlib.
- Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
-
- Statistics - Interface in org.apache.spark.sql.sources.v2.reader
-
- stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a StatCounter object that captures the mean, variance and
count of the RDD's elements in one operation.
- stats() - Method in class org.apache.spark.mllib.tree.model.Node
-
- stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Return a StatCounter object that captures the mean, variance and
count of the RDD's elements in one operation.
- StatsdMetricType - Class in org.apache.spark.metrics.sink
-
- StatsdMetricType() - Constructor for class org.apache.spark.metrics.sink.StatsdMetricType
-
- StatsReportListener - Class in org.apache.spark.scheduler
-
Developer API
Simple SparkListener that logs a few summary statistics when each stage completes.
- StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
-
- StatsReportListener - Class in org.apache.spark.streaming.scheduler
-
Developer API
A simple StreamingListener that logs summary statistics across Spark Streaming batches
param: numBatchInfos Number of last batches to consider for generating statistics (default: 10)
- StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
-
- status() - Method in class org.apache.spark.scheduler.TaskInfo
-
- status() - Method in interface org.apache.spark.SparkJobInfo
-
- status() - Method in class org.apache.spark.SparkJobInfoImpl
-
- status() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Returns the current status of the query.
- status() - Method in class org.apache.spark.status.api.v1.JobData
-
- status() - Method in class org.apache.spark.status.api.v1.StageData
-
- status() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
-
- status() - Method in class org.apache.spark.status.api.v1.TaskData
-
- status() - Method in class org.apache.spark.status.LiveJob
-
- status() - Method in class org.apache.spark.status.LiveStage
-
- STATUS() - Static method in class org.apache.spark.status.TaskIndexNames
-
- status() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockLocationsAndStatus
-
- statusTracker() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- statusTracker() - Method in class org.apache.spark.SparkContext
-
- StatusUpdate(String, long, Enumeration.Value, org.apache.spark.util.SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- StatusUpdate - Class in org.apache.spark.scheduler.local
-
- StatusUpdate(long, Enumeration.Value, ByteBuffer) - Constructor for class org.apache.spark.scheduler.local.StatusUpdate
-
- StatusUpdate$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
- STD() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
-
- std() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- std() - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- std() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- std() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- stddev(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: alias for stddev_samp.
- stddev(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: alias for stddev_samp.
- stddev_pop(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the population standard deviation of
the expression in a group.
- stddev_pop(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the population standard deviation of
the expression in a group.
- stddev_samp(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sample standard deviation of
the expression in a group.
- stddev_samp(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sample standard deviation of
the expression in a group.
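The difference between the population and sample standard deviations named in the entries above comes down to the denominator: n for the population form, n - 1 for the sample form. The plain-Java sketch below shows both side by side; the class `StdDevSketch` and its method names are hypothetical and are not the Spark SQL functions.

```java
/** Illustrative only: population vs. sample standard deviation of an array. */
final class StdDevSketch {
    /** Sum of squared deviations from the arithmetic mean. */
    static double sumSquaredDeviations(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        double mean = sum / xs.length;
        double ss = 0.0;
        for (double x : xs) {
            ss += (x - mean) * (x - mean);
        }
        return ss;
    }

    /** Population form: divide by n. */
    static double stddevPop(double[] xs) {
        return Math.sqrt(sumSquaredDeviations(xs) / xs.length);
    }

    /** Sample form: divide by n - 1 (requires at least two values). */
    static double stddevSamp(double[] xs) {
        return Math.sqrt(sumSquaredDeviations(xs) / (xs.length - 1));
    }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the squared deviations sum to 32, so the population form gives sqrt(32 / 8) = 2 while the sample form gives sqrt(32 / 7), which is slightly larger.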
- stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the population standard deviation of this RDD's elements.
- stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the population standard deviation of this RDD's elements.
- stdev() - Method in class org.apache.spark.util.StatCounter
-
Return the population standard deviation of the values.
- stepSize() - Method in interface org.apache.spark.ml.param.shared.HasStepSize
-
Param for Step size to be used for each iteration of optimization (> 0).
- stepSize() - Method in interface org.apache.spark.ml.tree.GBTParams
-
Param for Step size (a.k.a. learning rate).
- stop() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Shut down the SparkContext.
- stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- stop() - Method in interface org.apache.spark.launcher.SparkAppHandle
-
Asks the application to stop.
- stop() - Method in interface org.apache.spark.metrics.sink.Sink
-
- stop() - Method in interface org.apache.spark.rpc.RpcEndpoint
-
- stop() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- stop() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- stop() - Method in class org.apache.spark.SparkContext
-
Shut down the SparkContext.
- stop() - Method in class org.apache.spark.sql.SparkSession
-
Stop the underlying SparkContext.
- stop() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
-
Stops the execution of this query if it is running.
- stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- stop() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Method called to stop receiving data.
- stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Stop the receiver completely.
- stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Stop the receiver completely due to an exception.
- stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext
-
Stop the execution of the streams immediately (does not wait for all received data
to be processed).
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext
-
Stop the execution of the streams, with option of ensuring all received data
has been processed.
- StopAllReceivers - Class in org.apache.spark.streaming.scheduler
-
This message will trigger ReceiverTrackerEndpoint to send stop signals to all registered
receivers.
- StopAllReceivers() - Constructor for class org.apache.spark.streaming.scheduler.StopAllReceivers
-
- StopBlockManagerMaster$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
-
- StopCoordinator - Class in org.apache.spark.scheduler
-
- StopCoordinator() - Constructor for class org.apache.spark.scheduler.StopCoordinator
-
- StopDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
-
- StopExecutor - Class in org.apache.spark.scheduler.local
-
- StopExecutor() - Constructor for class org.apache.spark.scheduler.local.StopExecutor
-
- StopExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
-
- StopExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
-
- StopMapOutputTracker - Class in org.apache.spark
-
- StopMapOutputTracker() - Constructor for class org.apache.spark.StopMapOutputTracker
-
- StopReceiver - Class in org.apache.spark.streaming.receiver
-
- StopReceiver() - Constructor for class org.apache.spark.streaming.receiver.StopReceiver
-
- stopWords() - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
The words to be filtered out.
- StopWordsRemover - Class in org.apache.spark.ml.feature
-
A feature transformer that filters out stop words from input.
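Filtering stop words out of tokenized input can be pictured with a few lines of plain Java. The sketch assumes the input is already split into tokens, that the stop-word set holds lowercase entries, and that matching ignores case; the class `StopWordFilterSketch` is hypothetical and does not use the Spark ML transformer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative only: drops tokens that appear in a stop-word set. */
final class StopWordFilterSketch {
    /** Returns the tokens that are not stop words, preserving their order and original case. */
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Assumption: stopWords contains lowercase entries, so compare in lowercase.
            if (!stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

Given the tokens "The", "quick", "fox" and the stop words "the" and "a", only "quick" and "fox" remain.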
- StopWordsRemover(String) - Constructor for class org.apache.spark.ml.feature.StopWordsRemover
-
- StopWordsRemover() - Constructor for class org.apache.spark.ml.feature.StopWordsRemover
-
- storage() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
-
- STORAGE_MEMORY() - Static method in class org.apache.spark.ui.ToolTips
-
- storageLevel() - Method in class org.apache.spark.sql.Dataset
-
Get the Dataset's current storage level, or StorageLevel.NONE if not persisted.
- storageLevel() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-
- storageLevel() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- storageLevel() - Method in class org.apache.spark.status.LiveRDD
-
- storageLevel() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- storageLevel() - Method in class org.apache.spark.storage.BlockStatus
-
- storageLevel() - Method in class org.apache.spark.storage.BlockUpdatedInfo
-
- storageLevel() - Method in class org.apache.spark.storage.RDDInfo
-
- StorageLevel - Class in org.apache.spark.storage
-
Developer API
Flags for controlling the storage of an RDD.
- StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
-
- storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
-
- storageLevelFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- StorageLevels - Class in org.apache.spark.api.java
-
Expose some commonly useful storage level constants.
- StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
-
- storageLevelToJson(StorageLevel) - Static method in class org.apache.spark.util.JsonProtocol
-
- StorageUtils - Class in org.apache.spark.storage
-
Helper methods for storage-related objects.
- StorageUtils() - Constructor for class org.apache.spark.storage.StorageUtils
-
- store(T) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store a single item of received data to Spark's memory.
- store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store the bytes of received data as a data block into Spark's memory.
- store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store the bytes of received data as a data block into Spark's memory.
- storeBlock(StreamBlockId, ReceivedBlock) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
-
Store a received block with the given block id and return related metadata.
- storeValue(T) - Method in class org.apache.spark.storage.memory.DeserializedValuesHolder
-
- storeValue(T) - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
-
- storeValue(T) - Method in interface org.apache.spark.storage.memory.ValuesHolder
-
- strategy() - Method in interface org.apache.spark.ml.feature.ImputerParams
-
The imputation strategy.
- Strategy - Class in org.apache.spark.mllib.tree.configuration
-
Stores all the configuration options for tree construction
param: algo Learning goal.
- Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int, double, int, double, boolean, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-
- Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-
- StratifiedSamplingUtils - Class in org.apache.spark.util.random
-
Auxiliary functions and data structures for the sampleByKey method in PairRDDFunctions.
- StratifiedSamplingUtils() - Constructor for class org.apache.spark.util.random.StratifiedSamplingUtils
-
- STREAM() - Static method in class org.apache.spark.storage.BlockId
-
- StreamBlockId - Class in org.apache.spark.storage
-
- StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
-
- streamId() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
-
- streamId() - Method in class org.apache.spark.storage.StreamBlockId
-
- streamId() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Get the unique identifier of the receiver input stream that this
receiver is associated with.
- streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- streamIdToInputInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- StreamingContext - Class in org.apache.spark.streaming
-
Main entry point for Spark Streaming functionality.
- StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext using an existing SparkContext.
- StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext by providing the configuration necessary for a new SparkContext.
- StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext by providing the details necessary for creating a new SparkContext.
- StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file.
- StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file.
- StreamingContext(String, SparkContext) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file using an existing SparkContext.
- StreamingContextPythonHelper - Class in org.apache.spark.streaming
-
- StreamingContextPythonHelper() - Constructor for class org.apache.spark.streaming.StreamingContextPythonHelper
-
- StreamingContextState - Enum in org.apache.spark.streaming
-
Developer API
Represents the state of a StreamingContext.
- StreamingKMeans - Class in org.apache.spark.mllib.clustering
-
StreamingKMeans provides methods for configuring a
streaming k-means analysis, training the model on streaming data,
and using the model to make predictions on streaming data.
- StreamingKMeans(int, double, String) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
-
- StreamingKMeans() - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
-
- StreamingKMeansModel - Class in org.apache.spark.mllib.clustering
-
StreamingKMeansModel extends MLlib's KMeansModel for streaming
algorithms, so it can keep track of a continuously updated weight
associated with each cluster, and also update the model by
doing a single iteration of the standard k-means algorithm.
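The idea of a cluster center carrying a continuously updated weight can be sketched as a weighted average between the old center and the points newly assigned to it. The code below is a minimal illustration under stated assumptions: the class `StreamingCenterUpdateSketch`, the `decay` factor applied to the old weight, and the exact update formula are choices made for the example, not a transcription of `StreamingKMeansModel`.

```java
/** Illustrative only: one weighted update of a single cluster center. */
final class StreamingCenterUpdateSketch {
    /**
     * Blends the old center (counted with weight decay * weight) with the batch points
     * assigned to it. The matching new weight would be decay * weight + batch.length.
     */
    static double[] updateCenter(double[] center, double weight, double decay, double[][] batch) {
        int d = center.length;
        double discounted = decay * weight;
        double total = discounted + batch.length;
        if (total == 0.0) {
            // Nothing to average over; keep the center unchanged.
            return center.clone();
        }
        double[] updated = new double[d];
        for (int j = 0; j < d; j++) {
            updated[j] = center[j] * discounted;
        }
        for (double[] point : batch) {
            for (int j = 0; j < d; j++) {
                updated[j] += point[j];
            }
        }
        for (int j = 0; j < d; j++) {
            updated[j] /= total;
        }
        return updated;
    }
}
```

With `decay` equal to 1.0 every past point counts as much as a new one; with `decay` equal to 0.0 the center is replaced by the mean of the current batch.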
- StreamingKMeansModel(Vector[], double[]) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression
-
Developer API
StreamingLinearAlgorithm implements methods for continuously
training a generalized linear model on streaming data,
and using it for prediction on (possibly different) streaming data.
- StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
- StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train or predict a linear regression model on streaming data.
- StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Construct a StreamingLinearRegression object with default parameters:
{stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
- StreamingListener - Interface in org.apache.spark.streaming.scheduler
-
Developer API
A listener interface for receiving information about an ongoing streaming
computation.
- StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
-
- StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
-
- StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
-
- StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler
-
Developer API
Base trait for events related to StreamingListener.
- StreamingListenerOutputOperationCompleted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerOutputOperationCompleted(OutputOperationInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
-
- StreamingListenerOutputOperationStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerOutputOperationStarted(OutputOperationInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
-
- StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- StreamingListenerStreamingStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerStreamingStarted(long) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
-
- StreamingLogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
-
Train or predict a logistic regression model on streaming data.
- StreamingLogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Construct a StreamingLogisticRegression object with default parameters:
{stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}.
- StreamingQuery - Interface in org.apache.spark.sql.streaming
-
A handle to a query that is executing continuously in the background as new data arrives.
- StreamingQueryException - Exception in org.apache.spark.sql.streaming
-
- StreamingQueryListener - Class in org.apache.spark.sql.streaming
-
- StreamingQueryListener() - Constructor for class org.apache.spark.sql.streaming.StreamingQueryListener
-
- StreamingQueryListener.Event - Interface in org.apache.spark.sql.streaming
-
- StreamingQueryListener.QueryProgressEvent - Class in org.apache.spark.sql.streaming
-
Event representing any progress updates in a query.
- StreamingQueryListener.QueryStartedEvent - Class in org.apache.spark.sql.streaming
-
Event representing the start of a query
param: id A unique query id that persists across restarts.
- StreamingQueryListener.QueryTerminatedEvent - Class in org.apache.spark.sql.streaming
-
Event representing the termination of a query.
- StreamingQueryManager - Class in org.apache.spark.sql.streaming
-
- StreamingQueryProgress - Class in org.apache.spark.sql.streaming
-
Information about progress made in the execution of a StreamingQuery during a trigger.
- StreamingQueryStatus - Class in org.apache.spark.sql.streaming
-
Reports information about the instantaneous status of a streaming query.
- StreamingStatistics - Class in org.apache.spark.status.api.v1.streaming
-
- StreamingTest - Class in org.apache.spark.mllib.stat.test
-
Performs online 2-sample significance testing for a stream of (Boolean, Double) pairs.
- StreamingTest() - Constructor for class org.apache.spark.mllib.stat.test.StreamingTest
-
- StreamingTestMethod - Interface in org.apache.spark.mllib.stat.test
-
- StreamInputInfo - Class in org.apache.spark.streaming.scheduler
-
Developer API
Track the information of input stream at specified batch time.
- StreamInputInfo(int, long, Map<String, Object>) - Constructor for class org.apache.spark.streaming.scheduler.StreamInputInfo
-
- streamName() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
-
- streams() - Method in class org.apache.spark.sql.SparkSession
-
Experimental
Returns a StreamingQueryManager that allows managing all the StreamingQuery instances active on this SparkSession.
- streams() - Method in class org.apache.spark.sql.SQLContext
-
- StreamSinkProvider - Interface in org.apache.spark.sql.sources
-
Experimental
Implemented by objects that can produce a streaming Sink for a specific format or system.
- StreamSourceProvider - Interface in org.apache.spark.sql.sources
-
Experimental
Implemented by objects that can produce a streaming Source for a specific format or system.
- StreamWriter - Interface in org.apache.spark.sql.sources.v2.writer.streaming
-
- StreamWriteSupport - Interface in org.apache.spark.sql.sources.v2
-
- STRING() - Static method in class org.apache.spark.api.r.SerializationFormats
-
- string() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type string.
- STRING() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable string type.
- StringAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.StringAccumulatorParam$
-
Deprecated.
- StringArrayParam - Class in org.apache.spark.ml.param
-
Developer API
Specialized version of Param[Array[String]] for Java.
- StringArrayParam(Params, String, String, Function1<String[], Object>) - Constructor for class org.apache.spark.ml.param.StringArrayParam
-
- StringArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.StringArrayParam
-
- StringContains - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a string that contains the string value.
- StringContains(String, String) - Constructor for class org.apache.spark.sql.sources.StringContains
-
- StringEndsWith - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a string that ends with value.
- StringEndsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringEndsWith
-
- stringHalfWidth(String) - Static method in class org.apache.spark.util.Utils
-
Return the number of half widths in a given string.
- StringIndexer - Class in org.apache.spark.ml.feature
-
A label indexer that maps a string column of labels to an ML column of label indices.
- StringIndexer(String) - Constructor for class org.apache.spark.ml.feature.StringIndexer
-
- StringIndexer() - Constructor for class org.apache.spark.ml.feature.StringIndexer
-
- StringIndexerBase - Interface in org.apache.spark.ml.feature
-
- StringIndexerModel - Class in org.apache.spark.ml.feature
-
- StringIndexerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.StringIndexerModel
-
- StringIndexerModel(String[]) - Constructor for class org.apache.spark.ml.feature.StringIndexerModel
-
- stringIndexerOrderType() - Method in interface org.apache.spark.ml.feature.RFormulaBase
-
Param for how to order categories of a string FEATURE column used by StringIndexer.
- stringOrderType() - Method in interface org.apache.spark.ml.feature.StringIndexerBase
-
Param for how to order labels of string column.
- StringRRDD<T> - Class in org.apache.spark.api.r
-
An RDD that stores R objects as Array[String].
- StringRRDD(RDD<T>, byte[], String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.StringRRDD
-
- StringStartsWith - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a string that starts with value.
- StringStartsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringStartsWith
-
- StringToColumn(StringContext) - Constructor for class org.apache.spark.sql.SQLImplicits.StringToColumn
-
- stringToSeq(String, Function1<String, T>) - Static method in class org.apache.spark.internal.config.ConfigHelpers
-
- stringToSeq(String) - Static method in class org.apache.spark.util.Utils
-
- StringType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the StringType object.
- StringType - Class in org.apache.spark.sql.types
-
The data type representing String values.
- StringType() - Constructor for class org.apache.spark.sql.types.StringType
-
- stripXSS(String) - Static method in class org.apache.spark.ui.UIUtils
-
Removes suspicious characters from user input to prevent cross-site scripting (XSS) attacks.
- stronglyConnectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps
-
Compute the strongly connected component (SCC) of each vertex and return a graph with the
vertex value containing the lowest vertex id in the SCC containing that vertex.
- StronglyConnectedComponents - Class in org.apache.spark.graphx.lib
-
Strongly connected components algorithm implementation.
- StronglyConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.StronglyConnectedComponents
-
- struct(Seq<StructField>) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type struct.
- struct(StructType) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type struct.
- struct(Column...) - Static method in class org.apache.spark.sql.functions
-
Creates a new struct column.
- struct(String, String...) - Static method in class org.apache.spark.sql.functions
-
Creates a new struct column that composes multiple input columns.
- struct(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Creates a new struct column.
- struct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Creates a new struct column that composes multiple input columns.
- StructField - Class in org.apache.spark.sql.types
-
A field inside a StructType.
- StructField(String, DataType, boolean, Metadata) - Constructor for class org.apache.spark.sql.types.StructField
-
- StructType - Class in org.apache.spark.sql.types
-
- StructType(StructField[]) - Constructor for class org.apache.spark.sql.types.StructType
-
- StructType() - Constructor for class org.apache.spark.sql.types.StructType
-
No-arg constructor for kryo.
- stsCredentials(String, String) - Method in class org.apache.spark.streaming.kinesis.SparkAWSCredentials.Builder
-
Use STS to assume an IAM role for temporary session-based authentication.
- stsCredentials(String, String, String) - Method in class org.apache.spark.streaming.kinesis.SparkAWSCredentials.Builder
-
Use STS to assume an IAM role for temporary session-based authentication.
- StudentTTest - Class in org.apache.spark.mllib.stat.test
-
Performs Student's 2-sample t-test.
- StudentTTest() - Constructor for class org.apache.spark.mllib.stat.test.StudentTTest
-
- subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.Graph
-
Restricts the graph to only the vertices and edges satisfying the predicates.
- subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- submissionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
When this stage was submitted from the DAGScheduler to a TaskScheduler.
- submissionTime() - Method in interface org.apache.spark.SparkStageInfo
-
- submissionTime() - Method in class org.apache.spark.SparkStageInfoImpl
-
- submissionTime() - Method in class org.apache.spark.status.api.v1.JobData
-
- submissionTime() - Method in class org.apache.spark.status.api.v1.StageData
-
- submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in interface org.apache.spark.JobSubmitter
-
Submit a job for execution and return a FutureAction holding the result.
- submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext
-
Submit a job for execution and return a FutureAction holding the result.
- submitTasks(TaskSet) - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- subModels() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- subModels() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-
- subsamplingRate() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
For Online optimizer only: optimizer = "online".
- subsamplingRate() - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
-
Fraction of the training data used for learning each decision tree, in range (0, 1].
- subsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- subsetAccuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns subset accuracy (for equal sets of labels).
- substituteAppId(String, String) - Static method in class org.apache.spark.util.Utils
-
Replaces all the {{APP_ID}} occurrences with the App Id.
- substituteAppNExecIds(String, String, String) - Static method in class org.apache.spark.util.Utils
-
Replaces all the {{EXECUTOR_ID}} occurrences with the Executor Id
and {{APP_ID}} occurrences with the App Id.
- substr(Column, Column) - Method in class org.apache.spark.sql.Column
-
An expression that returns a substring.
- substr(int, int) - Method in class org.apache.spark.sql.Column
-
An expression that returns a substring.
- substring(Column, int, int) - Static method in class org.apache.spark.sql.functions
-
Substring starts at pos and is of length len when str is String type, or returns the slice of byte array that starts at pos in byte and is of length len when str is Binary type.
- substring_index(Column, String, int) - Static method in class org.apache.spark.sql.functions
-
Returns the substring from string str before count occurrences of the delimiter delim.
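The substring_index entry above can be illustrated with a plain-Java sketch that does not depend on Spark. The class and method names are hypothetical, and the handling of a negative count (counting delimiters from the right) and of a zero count (empty result) reflects the behaviour described in Spark's function documentation as far as I know; treat those branches as assumptions rather than as the library's implementation.

```java
// Hypothetical stand-alone sketch of substring_index semantics; not Spark code.
final class SubstringIndexSketch {
    private SubstringIndexSketch() {}

    static String substringIndex(String str, String delim, int count) {
        // Assumption: a zero count or an empty delimiter yields an empty string.
        if (count == 0 || delim.isEmpty()) {
            return "";
        }
        if (count > 0) {
            // Walk left to right until the count-th delimiter is found.
            int idx = -1;
            for (int i = 0; i < count; i++) {
                idx = str.indexOf(delim, idx + 1);
                if (idx < 0) {
                    return str; // fewer than count delimiters: whole string
                }
            }
            return str.substring(0, idx);
        }
        // Negative count: walk right to left and keep what follows the delimiter.
        int idx = str.length();
        for (int i = 0; i < -count; i++) {
            idx = str.lastIndexOf(delim, idx - 1);
            if (idx < 0) {
                return str;
            }
        }
        return str.substring(idx + delim.length());
    }
}
```

For example, calling substringIndex("www.apache.org", ".", 2) keeps everything before the second dot, while a count of -1 keeps everything after the last dot.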
- subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from this that are not in other.
- subtract(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Subtracts the given block matrix other from this block matrix: this - other.
- subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from this that are not in other.
- subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from this that are not in other.
- subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from this that are not in other.
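The subtract entries above all describe the same set-difference idea. A minimal sketch on in-memory lists, assuming no Spark dependency, looks like this; the class name SubtractSketch is hypothetical, and unlike a distributed RDD this version preserves the input order, which Spark does not promise.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical in-memory analogue of subtract; not Spark's implementation.
final class SubtractSketch {
    private SubtractSketch() {}

    static <T> List<T> subtract(List<T> left, List<T> right) {
        // Hash the right-hand side once so each membership test is O(1).
        Set<T> exclude = new HashSet<>(right);
        // Keep every element of left (duplicates included) that is absent from right.
        return left.stream()
                .filter(x -> !exclude.contains(x))
                .collect(Collectors.toList());
    }
}
```

With left = [1, 2, 2, 3, 4] and right = [2, 4], the result contains only 1 and 3.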
- subtract(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
- subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from this whose keys are not in other.
- subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from this whose keys are not in other.
- subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from this whose keys are not in other.
- subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from this whose keys are not in other.
- subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from this whose keys are not in other.
- subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from this whose keys are not in other.
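The subtractByKey entries above differ from subtract in that only keys are compared; the value types of the two sides may differ. The following plain-Java sketch models pairs as Map.Entry values in lists. The class name SubtractByKeySketch is hypothetical and nothing here is Spark API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory analogue of subtractByKey; not Spark's implementation.
final class SubtractByKeySketch {
    private SubtractByKeySketch() {}

    static <K, V, W> List<Map.Entry<K, V>> subtractByKey(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
        // Collect the keys that must be excluded; values on the right are ignored.
        Set<K> excludedKeys = new HashSet<>();
        for (Map.Entry<K, W> e : right) {
            excludedKeys.add(e.getKey());
        }
        // Keep each left pair whose key never appears on the right.
        List<Map.Entry<K, V>> out = new ArrayList<>();
        for (Map.Entry<K, V> e : left) {
            if (!excludedKeys.contains(e.getKey())) {
                out.add(e);
            }
        }
        return out;
    }
}
```

Note that a pair is dropped whenever its key occurs on the right, regardless of the values attached to that key on either side.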
- subtractMetrics(TaskMetrics, TaskMetrics) - Static method in class org.apache.spark.status.LiveEntityHelpers
-
Subtract m2 values from m1.
- succeededTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- succeededTasks() - Method in class org.apache.spark.status.LiveExecutorStageSummary
-
- success(T) - Static method in class org.apache.spark.ml.feature.RFormulaParser
-
- Success - Class in org.apache.spark
-
Developer API
Task succeeded.
- Success() - Constructor for class org.apache.spark.Success
-
- successful() - Method in class org.apache.spark.scheduler.TaskInfo
-
- sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Add up the elements in this RDD.
- Sum() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Add up the elements in this RDD.
- sum(MapFunction<T, Double>) - Static method in class org.apache.spark.sql.expressions.javalang.typed
-
Sum aggregate function for floating point (double) type.
- sum(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed
-
Sum aggregate function for floating point (double) type.
- sum(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of all values in the expression.
- sum(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of all values in the given column.
- sum(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute the sum of each numeric column for each group.
- sum(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
Compute the sum of each numeric column for each group.
- sum() - Method in class org.apache.spark.util.DoubleAccumulator
-
Returns the sum of elements added to the accumulator.
- sum() - Method in class org.apache.spark.util.LongAccumulator
-
Returns the sum of elements added to the accumulator.
- sum() - Method in class org.apache.spark.util.StatCounter
-
- sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Approximate operation to return the sum within a timeout.
- sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Approximate operation to return the sum within a timeout.
- sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Approximate operation to return the sum within a timeout.
- sumDistinct(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of distinct values in the expression.
- sumDistinct(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of distinct values in the expression.
- sumLong(MapFunction<T, Long>) - Static method in class org.apache.spark.sql.expressions.javalang.typed
-
Sum aggregate function for integral (long, i.e.
- sumLong(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed
-
Sum aggregate function for integral (long, i.e.
- Summarizer - Class in org.apache.spark.ml.stat
-
Tools for vectorized statistics on MLlib Vectors.
- Summarizer() - Constructor for class org.apache.spark.ml.stat.Summarizer
-
- summary() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
Gets summary of model on training set.
- summary() - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
Gets summary of model on training set.
- summary() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
Gets summary of model on training set.
- summary() - Method in class org.apache.spark.ml.clustering.KMeansModel
-
Gets summary of model on training set.
- summary() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
Gets R-like summary of model on training set.
- summary() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
Gets summary (e.g.
- summary(Column, Column) - Method in class org.apache.spark.ml.stat.SummaryBuilder
-
Returns an aggregate object that contains the summary of the column with the requested metrics.
- summary(Column) - Method in class org.apache.spark.ml.stat.SummaryBuilder
-
- summary(String...) - Method in class org.apache.spark.sql.Dataset
-
Computes specified statistics for numeric and string columns.
- summary(Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Computes specified statistics for numeric and string columns.
- SummaryBuilder - Class in org.apache.spark.ml.stat
-
A builder object that provides summary statistics about a given column.
- SummaryBuilder() - Constructor for class org.apache.spark.ml.stat.SummaryBuilder
-
- supportDataType(DataType, boolean) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
-
Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
-
Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.mllib.tree.RandomForest
-
List of supported feature subset sampling strategies.
- supportedImpurities() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
Accessor for supported impurities: entropy, gini
- supportedImpurities() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
-
Accessor for supported impurity settings: entropy, gini
- supportedImpurities() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
Accessor for supported impurities: variance
- supportedImpurities() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
-
Accessor for supported impurity settings: variance
- supportedLossTypes() - Static method in class org.apache.spark.ml.classification.GBTClassifier
-
Accessor for supported loss settings: logistic
- supportedLossTypes() - Static method in class org.apache.spark.ml.regression.GBTRegressor
-
Accessor for supported loss settings: squared (L2), absolute (L1)
- supportedOptimizers() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
Supported values for Param optimizer.
- supportedSelectorTypes() - Static method in class org.apache.spark.mllib.feature.ChiSqSelector
-
Set of selector types that ChiSqSelector supports.
- SupportsPushDownFilters - Interface in org.apache.spark.sql.sources.v2.reader
-
- SupportsPushDownRequiredColumns - Interface in org.apache.spark.sql.sources.v2.reader
-
- SupportsReportPartitioning - Interface in org.apache.spark.sql.sources.v2.reader
-
- SupportsReportStatistics - Interface in org.apache.spark.sql.sources.v2.reader
-
- SupportsScanColumnarBatch - Interface in org.apache.spark.sql.sources.v2.reader
-
- surrogateDF() - Method in class org.apache.spark.ml.feature.ImputerModel
-
- SVDPlusPlus - Class in org.apache.spark.graphx.lib
-
Implementation of SVD++ algorithm.
- SVDPlusPlus() - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus
-
- SVDPlusPlus.Conf - Class in org.apache.spark.graphx.lib
-
Configuration parameters for SVDPlusPlus.
- SVMDataGenerator - Class in org.apache.spark.mllib.util
-
Developer API
Generate sample data used for SVM.
- SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
-
- SVMModel - Class in org.apache.spark.mllib.classification
-
Model for Support Vector Machines (SVMs).
- SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
-
- SVMWithSGD - Class in org.apache.spark.mllib.classification
-
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
- SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD
-
Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100,
regParam: 0.01, miniBatchFraction: 1.0}.
- symbolToColumn(Symbol) - Method in class org.apache.spark.sql.SQLImplicits
-
An implicit conversion that turns a Scala Symbol into a Column.
- symlink(File, File) - Static method in class org.apache.spark.util.Utils
-
Creates a symlink.
- symmetricEigs(Function1<DenseVector<Object>, DenseVector<Object>>, int, int, double, int) - Static method in class org.apache.spark.mllib.linalg.EigenValueDecomposition
-
Compute the leading k eigenvalues and eigenvectors on a symmetric square matrix using ARPACK.
- syr(double, Vector, DenseMatrix) - Static method in class org.apache.spark.ml.linalg.BLAS
-
A := alpha * x * x^T + A
- syr(double, Vector, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
A := alpha * x * x^T + A
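The rank-one update in the syr entries above, A := alpha * x * x^T + A, can be written out element by element as A[i][j] += alpha * x[i] * x[j]. The sketch below does this on a plain double[][] array; the class name SyrSketch and the row-major array layout are assumptions for illustration and do not mirror Spark's BLAS wrapper or its DenseMatrix storage.

```java
// Hypothetical dense rank-one update; not Spark's BLAS implementation.
final class SyrSketch {
    private SyrSketch() {}

    /** Updates a in place so that a = alpha * x * x^T + a. */
    static void syr(double alpha, double[] x, double[][] a) {
        int n = x.length;
        // Validate that a is an n-by-n matrix before mutating anything.
        if (a.length != n) {
            throw new IllegalArgumentException("a must have " + n + " rows");
        }
        for (double[] row : a) {
            if (row.length != n) {
                throw new IllegalArgumentException("a must have " + n + " columns");
            }
        }
        // Entry (i, j) of the outer product x * x^T is x[i] * x[j].
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                a[i][j] += alpha * x[i] * x[j];
            }
        }
    }
}
```

Because x * x^T is symmetric, a symmetric input stays symmetric after the update.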
- SYSTEM_DEFAULT() - Static method in class org.apache.spark.sql.types.DecimalType
-
- systemProperties() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
-
- t() - Method in class org.apache.spark.SerializableWritable
-
- Table - Class in org.apache.spark.sql.catalog
-
A table in Spark, as returned by the listTables method in Catalog.
- Table(String, String, String, String, boolean) - Constructor for class org.apache.spark.sql.catalog.Table
-
- table(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Returns the specified table as a DataFrame.
- table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- table(String) - Method in class org.apache.spark.sql.SparkSession
-
Returns the specified table/view as a DataFrame.
- table(String) - Method in class org.apache.spark.sql.SQLContext
-
- table(int) - Method in interface org.apache.spark.ui.PagedTable
-
- TABLE_CLASS_NOT_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
-
- TABLE_CLASS_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
-
- TABLE_CLASS_STRIPED_SORTABLE() - Static method in class org.apache.spark.ui.UIUtils
-
- TABLE_KEY - Static variable in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
The option key for table name.
- tableCssClass() - Method in interface org.apache.spark.ui.PagedTable
-
- tableDesc() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
-
- tableExists(String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Check if the table or view with the specified name exists.
- tableExists(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
-
Check if the table or view with the specified name exists in the specified database.
- tableExists(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient
-
Return whether a table/view with the specified name exists.
- tableId() - Method in interface org.apache.spark.ui.PagedTable
-
- tableName() - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
-
Returns the value of the table name option.
- tableNames() - Method in class org.apache.spark.sql.SQLContext
-
- tableNames(String) - Method in class org.apache.spark.sql.SQLContext
-
- TableReader - Interface in org.apache.spark.sql.hive
-
A trait for subclasses that handle table scans.
- tables() - Method in class org.apache.spark.sql.SQLContext
-
- tables(String) - Method in class org.apache.spark.sql.SQLContext
-
- TableScan - Interface in org.apache.spark.sql.sources
-
A BaseRelation that can produce all of its tuples as an RDD of Row objects.
- tableType() - Method in class org.apache.spark.sql.catalog.Table
-
- take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Take the first num elements of the RDD.
- take(int) - Method in class org.apache.spark.rdd.RDD
-
Take the first num elements of the RDD.
- take(int) - Method in class org.apache.spark.sql.Dataset
-
Returns the first n rows in the Dataset.
- takeAsList(int) - Method in class org.apache.spark.sql.Dataset
-
Returns the first n rows in the Dataset as a list.
- takeAsync(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the take action, which returns a future for retrieving the first num elements of this RDD.
- takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving the first num elements of the RDD.
- takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the first k (smallest) elements from this RDD as defined by
the specified Comparator[T] and maintains the order.
- takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the first k (smallest) elements from this RDD using the
natural ordering for T while maintaining the order.
- takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the first k (smallest) elements from this RDD as defined by the specified
implicit Ordering[T] and maintains the ordering.
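The takeOrdered entries above return the k smallest elements in sorted order. One common way to compute this without sorting the whole input is a bounded max-heap, sketched below on an in-memory collection. The class name TakeOrderedSketch is hypothetical, and this is an illustration of the technique rather than a description of how Spark evaluates takeOrdered across partitions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical bounded-heap top-k sketch; not Spark's implementation.
final class TakeOrderedSketch {
    private TakeOrderedSketch() {}

    static <T> List<T> takeOrdered(Collection<T> items, int k, Comparator<T> cmp) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Max-heap under cmp: the head is the largest element currently retained.
        PriorityQueue<T> heap = new PriorityQueue<>(k, cmp.reversed());
        for (T item : items) {
            if (heap.size() < k) {
                heap.add(item);
            } else if (cmp.compare(item, heap.peek()) < 0) {
                // The new item is smaller than the largest kept one: swap it in.
                heap.poll();
                heap.add(item);
            }
        }
        // Emit the retained elements in ascending order under cmp.
        List<T> out = new ArrayList<>(heap);
        out.sort(cmp);
        return out;
    }
}
```

The heap never holds more than k elements, so memory stays bounded by k even for a large input.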
- takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
-
Return a fixed-size sampled subset of this RDD in an array
- tallSkinnyQR(boolean) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- tan(Column) - Static method in class org.apache.spark.sql.functions
-
- tan(String) - Static method in class org.apache.spark.sql.functions
-
- tanh(Column) - Static method in class org.apache.spark.sql.functions
-
- tanh(String) - Static method in class org.apache.spark.sql.functions
-
- targetStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- targetStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- task() - Method in class org.apache.spark.CleanupTaskWeakReference
-
- TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- TASK_INDEX() - Static method in class org.apache.spark.status.TaskIndexNames
-
- TASK_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- taskAttemptId() - Method in class org.apache.spark.BarrierTaskContext
-
- taskAttemptId() - Method in class org.apache.spark.TaskContext
-
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts
will share the same attempt ID).
- TaskCommitDenied - Class in org.apache.spark
-
Developer API
Task requested the driver to commit, but was denied.
- TaskCommitDenied(int, int, int) - Constructor for class org.apache.spark.TaskCommitDenied
-
- TaskCommitMessage(Object) - Constructor for class org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage
-
- TaskCompletionListener - Interface in org.apache.spark.util
-
Developer API
- TaskContext - Class in org.apache.spark
-
Contextual information about a task which can be read or mutated during
execution.
- TaskContext() - Constructor for class org.apache.spark.TaskContext
-
- TaskData - Class in org.apache.spark.status.api.v1
-
- TaskDetailsClassNames - Class in org.apache.spark.ui.jobs
-
Names of the CSS classes corresponding to each type of task detail.
- TaskDetailsClassNames() - Constructor for class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- taskEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskEndReason - Interface in org.apache.spark
-
Developer API
Various possible reasons why a task ended.
- taskEndReasonFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskEndReasonToJson(TaskEndReason) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskEndToJson(SparkListenerTaskEnd) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskFailedReason - Interface in org.apache.spark
-
Developer API
Various possible reasons why a task failed.
- TaskFailureListener - Interface in org.apache.spark.util
-
Developer API
- taskFailures() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
-
- taskFailures() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
-
- taskGettingResultFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskGettingResultToJson(SparkListenerTaskGettingResult) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
-
- taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- taskId() - Method in class org.apache.spark.scheduler.local.KillTask
-
- taskId() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- taskId() - Method in class org.apache.spark.scheduler.TaskInfo
-
- taskId() - Method in class org.apache.spark.status.api.v1.TaskData
-
- taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
-
- TaskIndexNames - Class in org.apache.spark.status
-
Tasks have a lot of indices that are used in a few different places.
- TaskIndexNames() - Constructor for class org.apache.spark.status.TaskIndexNames
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- TaskInfo - Class in org.apache.spark.scheduler
-
Developer API
Information about a running task attempt inside a TaskSet.
- TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
-
- taskInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskInfoToJson(TaskInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskKilled - Class in org.apache.spark
-
Developer API
Task was killed intentionally and needs to be rescheduled.
- TaskKilled(String, Seq<AccumulableInfo>, Seq<AccumulatorV2<?, ?>>) - Constructor for class org.apache.spark.TaskKilled
-
- TaskKilledException - Exception in org.apache.spark
-
Developer API
Exception thrown when a task is explicitly killed (i.e., task failure is expected).
- TaskKilledException(String) - Constructor for exception org.apache.spark.TaskKilledException
-
- TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
-
- taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
-
- TaskLocality - Class in org.apache.spark.scheduler
-
- TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
-
- taskLocality() - Method in class org.apache.spark.status.api.v1.TaskData
-
- TaskLocation - Interface in org.apache.spark.scheduler
-
A location where a task should run.
- TaskMetricDistributions - Class in org.apache.spark.status.api.v1
-
- taskMetrics() - Method in class org.apache.spark.BarrierTaskContext
-
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- taskMetrics() - Method in class org.apache.spark.scheduler.StageInfo
-
- taskMetrics() - Method in class org.apache.spark.status.api.v1.TaskData
-
- TaskMetrics - Class in org.apache.spark.status.api.v1
-
- taskMetrics() - Method in class org.apache.spark.TaskContext
-
- taskMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskMetricsToJson(TaskMetrics) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskResult<T> - Interface in org.apache.spark.scheduler
-
- TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
-
- TaskResultBlockId - Class in org.apache.spark.storage
-
- TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
-
- TaskResultLost - Class in org.apache.spark
-
Developer API
The task finished successfully, but the result was lost from the executor's block manager before
it was fetched.
- TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
-
- tasks() - Method in class org.apache.spark.status.api.v1.StageData
-
- TaskScheduler - Interface in org.apache.spark.scheduler
-
Low-level task scheduler interface, currently implemented exclusively by TaskSchedulerImpl.
- TaskSchedulerIsSet - Class in org.apache.spark
-
An event that SparkContext uses to notify HeartbeatReceiver that SparkContext.taskScheduler is
created.
- TaskSchedulerIsSet() - Constructor for class org.apache.spark.TaskSchedulerIsSet
-
- TaskSorting - Enum in org.apache.spark.status.api.v1
-
- taskStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskStartToJson(SparkListenerTaskStart) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskState - Class in org.apache.spark
-
- TaskState() - Constructor for class org.apache.spark.TaskState
-
- taskSucceeded(int, Object) - Method in interface org.apache.spark.scheduler.JobListener
-
- taskTime() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- taskTime() - Method in class org.apache.spark.status.LiveExecutorStageSummary
-
- taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- TEMP_DIR_SHUTDOWN_PRIORITY() - Static method in class org.apache.spark.util.ShutdownHookManager
-
The shutdown priority of temp directory must be lower than the SparkContext shutdown
priority.
- TEMP_LOCAL() - Static method in class org.apache.spark.storage.BlockId
-
- TEMP_SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
-
- tempFileWith(File) - Static method in class org.apache.spark.util.Utils
-
Returns the path of a temporary file located in the same directory as path.
- TeradataDialect - Class in org.apache.spark.sql.jdbc
-
- TeradataDialect() - Constructor for class org.apache.spark.sql.jdbc.TeradataDialect
-
- Term - Interface in org.apache.spark.ml.feature
-
R formula terms.
- terminateProcess(Process, long) - Static method in class org.apache.spark.util.Utils
-
Terminates a process waiting for at most the specified duration.
- test(Dataset<Row>, String, String) - Static method in class org.apache.spark.ml.stat.ChiSquareTest
-
Conduct Pearson's independence test for every feature against the label.
- test(Dataset<?>, String, String, double...) - Static method in class org.apache.spark.ml.stat.KolmogorovSmirnovTest
-
Convenience function to conduct a one-sample, two-sided Kolmogorov-Smirnov test for probability
distribution equality.
- test(Dataset<?>, String, Function1<Object, Object>) - Static method in class org.apache.spark.ml.stat.KolmogorovSmirnovTest
-
- test(Dataset<?>, String, Function<Double, Double>) - Static method in class org.apache.spark.ml.stat.KolmogorovSmirnovTest
-
- test(Dataset<?>, String, String, Seq<Object>) - Static method in class org.apache.spark.ml.stat.KolmogorovSmirnovTest
-
- TEST() - Static method in class org.apache.spark.storage.BlockId
-
- TEST_ACCUM() - Static method in class org.apache.spark.InternalAccumulator
-
- testCommandAvailable(String) - Static method in class org.apache.spark.TestUtils
-
Test if a command is available.
- testOneSample(RDD<Object>, String, double...) - Static method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest
-
A convenience function that allows running the KS test for 1 set of sample data against
a named distribution
- testOneSample(RDD<Object>, Function1<Object, Object>) - Static method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest
-
- testOneSample(RDD<Object>, RealDistribution) - Static method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest
-
- testOneSample(RDD<Object>, String, Seq<Object>) - Static method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest
-
- TestResult<DF> - Interface in org.apache.spark.mllib.stat.test
-
Trait for hypothesis test results.
- TestUtils - Class in org.apache.spark
-
Utilities for tests.
- TestUtils() - Constructor for class org.apache.spark.TestUtils
-
- text(String...) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads text files and returns a DataFrame whose schema starts with a string column named
"value", followed by partitioned columns if there are any.
- text(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads text files and returns a DataFrame whose schema starts with a string column named
"value", followed by partitioned columns if there are any.
- text(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads text files and returns a DataFrame whose schema starts with a string column named
"value", followed by partitioned columns if there are any.
- text(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame
in a text file at the specified path.
- text(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Loads text files and returns a DataFrame whose schema starts with a string column named
"value", followed by partitioned columns if there are any.
- textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String, int) - Method in class org.apache.spark.SparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String...) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads text files and returns a Dataset of String.
- textFile(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads text files and returns a Dataset of String.
- textFile(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads text files and returns a Dataset of String.
- textFile(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader
-
Loads text file(s) and returns a Dataset of String.
- textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them as text files (using key as LongWritable, value
as Text and input format as TextInputFormat).
- textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them as text files (using key as LongWritable, value
as Text and input format as TextInputFormat).
- textResponderToServlet(Function1<HttpServletRequest, String>) - Static method in class org.apache.spark.ui.JettyUtils
-
- theta() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-
- theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
-
- theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
-
- theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- thisClassName() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
-
Hard-code class name string in case it changes in the future
- thisClassName() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
-
Hard-code class name string in case it changes in the future
- thisClassName() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- threadId() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
-
- threadName() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
-
- ThreadSafeRpcEndpoint - Interface in org.apache.spark.rpc
-
A trait that requires RpcEnv to send messages to it in a thread-safe manner.
- ThreadStackTrace - Class in org.apache.spark.status.api.v1
-
- ThreadStackTrace(long, String, Thread.State, StackTrace, Option<Object>, String, Seq<String>) - Constructor for class org.apache.spark.status.api.v1.ThreadStackTrace
-
- threadState() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
-
- ThreadUtils - Class in org.apache.spark.util
-
- ThreadUtils() - Constructor for class org.apache.spark.util.ThreadUtils
-
- threshold() - Method in interface org.apache.spark.ml.classification.LinearSVCParams
-
Param for threshold in binary classification prediction.
- threshold() - Method in class org.apache.spark.ml.feature.Binarizer
-
Param for threshold used to binarize continuous features.
- threshold() - Method in interface org.apache.spark.ml.param.shared.HasThreshold
-
Param for threshold in binary classification prediction, in range [0, 1].
- threshold() - Method in class org.apache.spark.ml.tree.ContinuousSplit
-
- threshold() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
-
- threshold() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- threshold() - Method in class org.apache.spark.mllib.tree.model.Split
-
- thresholds() - Method in interface org.apache.spark.ml.param.shared.HasThresholds
-
Param for Thresholds in multi-class classification to adjust the probability of predicting each class.
- thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns thresholds in descending order.
- throwBalls(int, RDD<?>, double, DefaultPartitionCoalescer.PartitionLocations) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
-
- time(Function0<T>) - Method in class org.apache.spark.sql.SparkSession
-
Executes some code block and prints to stdout the time taken to execute the block.
- time() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
-
Time when the exception occurred
- time() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
-
- Time - Class in org.apache.spark.streaming
-
This is a simple class that represents an absolute instant of time.
- Time(long) - Constructor for class org.apache.spark.streaming.Time
-
- timeFromString(String, TimeUnit) - Static method in class org.apache.spark.internal.config.ConfigHelpers
-
- timeIt(int, Function0<BoxedUnit>, Option<Function0<BoxedUnit>>) - Static method in class org.apache.spark.util.Utils
-
Timing method based on iterations that permit JVM JIT optimization.
- timeout(Duration) - Method in class org.apache.spark.streaming.StateSpec
-
Set the duration after which the state of an idle key will be removed.
- TIMER() - Static method in class org.apache.spark.metrics.sink.StatsdMetricType
-
- times(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- times(int) - Method in class org.apache.spark.streaming.Duration
-
- times(int, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Method executed for repeating a task for side effects.
- timestamp() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type timestamp.
- TIMESTAMP() - Static method in class org.apache.spark.sql.Encoders
-
An encoder for nullable timestamp type.
- timestamp() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
- TimestampType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the TimestampType object.
- TimestampType - Class in org.apache.spark.sql.types
-
The data type representing java.sql.Timestamp values.
- TimestampType() - Constructor for class org.apache.spark.sql.types.TimestampType
-
- timeStringAsMs(String) - Static method in class org.apache.spark.util.Utils
-
Convert a time parameter such as (50s, 100ms, or 250us) to milliseconds for internal use.
- timeStringAsSeconds(String) - Static method in class org.apache.spark.util.Utils
-
Convert a time parameter such as (50s, 100ms, or 250us) to seconds for internal use.
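The two conversions above can be illustrated with a minimal standalone sketch. It does not call Spark; the class `TimeStringSketch`, its method names, and the suffix table are assumptions made for illustration, and the set of suffixes Spark actually accepts may differ. Conversion to a coarser unit truncates, so a value such as 250us becomes 0 when expressed in milliseconds.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spark: parses strings such as "50s" or "100ms".
class TimeStringSketch {
    private static final Pattern TIME = Pattern.compile("(-?[0-9]+)([a-z]+)?");

    // Maps a suffix to a unit. This table is an assumption of the sketch.
    private static TimeUnit unitFor(String suffix) {
        switch (suffix) {
            case "us": return TimeUnit.MICROSECONDS;
            case "ms": return TimeUnit.MILLISECONDS;
            case "s":  return TimeUnit.SECONDS;
            case "m":  return TimeUnit.MINUTES;
            case "h":  return TimeUnit.HOURS;
            case "d":  return TimeUnit.DAYS;
            default:   throw new NumberFormatException("Unknown time suffix: " + suffix);
        }
    }

    // Converts the text to the target unit; a bare number is read as the target unit.
    static long convert(String text, TimeUnit target) {
        Matcher m = TIME.matcher(text.trim().toLowerCase(Locale.ROOT));
        if (!m.matches()) {
            throw new NumberFormatException("Failed to parse time string: " + text);
        }
        long value = Long.parseLong(m.group(1));
        TimeUnit source = (m.group(2) == null) ? target : unitFor(m.group(2));
        return target.convert(value, source); // truncates toward zero
    }

    static long asMs(String text) {
        return convert(text, TimeUnit.MILLISECONDS);
    }

    static long asSeconds(String text) {
        return convert(text, TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        System.out.println(asMs("50s"));
        System.out.println(asSeconds("2m"));
    }
}
```

Reading a bare number as the target unit is a design choice of this sketch; callers that need sub-unit precision should convert to a finer unit first.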
- timeTakenMs(Function0<T>) - Static method in class org.apache.spark.util.Utils
-
Records the duration of running `body`.
- timeToString(long, TimeUnit) - Static method in class org.apache.spark.internal.config.ConfigHelpers
-
- TimeTrackingOutputStream - Class in org.apache.spark.storage
-
Intercepts write calls and tracks total time spent writing in order to update shuffle write
metrics.
- TimeTrackingOutputStream(ShuffleWriteMetrics, OutputStream) - Constructor for class org.apache.spark.storage.TimeTrackingOutputStream
-
- timeUnit() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- TIMING_DATA() - Static method in class org.apache.spark.api.r.SpecialLengths
-
- to(Time, Duration) - Method in class org.apache.spark.streaming.Time
-
- to_date(Column) - Static method in class org.apache.spark.sql.functions
-
Converts the column into DateType by casting rules to DateType.
- to_date(Column, String) - Static method in class org.apache.spark.sql.functions
-
Converts the column into a DateType with a specified format.
- to_json(Column, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Scala-specific) Converts a column containing a StructType, ArrayType or a MapType
into a JSON string with the specified schema.
- to_json(Column, Map<String, String>) - Static method in class org.apache.spark.sql.functions
-
(Java-specific) Converts a column containing a StructType, ArrayType or a MapType
into a JSON string with the specified schema.
- to_json(Column) - Static method in class org.apache.spark.sql.functions
-
Converts a column containing a StructType, ArrayType or a MapType
into a JSON string with the specified schema.
- to_timestamp(Column) - Static method in class org.apache.spark.sql.functions
-
Converts to a timestamp by casting rules to TimestampType.
- to_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions
-
Converts time string with the given pattern to timestamp.
- to_utc_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions
-
Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time
zone, and renders that time as a timestamp in UTC.
- to_utc_timestamp(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time
zone, and renders that time as a timestamp in UTC.
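The semantics described for to_utc_timestamp can be sketched with java.time alone, without Spark. The class `ToUtcSketch`, the method `toUtc`, and the fixed timestamp pattern are assumptions made for illustration; this is not how Spark implements the function.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Spark.
class ToUtcSketch {
    // Assumed input/output layout, matching the '2017-07-14 02:40:00.0' example.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.S");

    // Reads the timestamp as wall-clock time in `zone`, then renders the same instant in UTC.
    static String toUtc(String timestamp, String zone) {
        LocalDateTime wallClock = LocalDateTime.parse(timestamp, FORMAT);
        return wallClock.atZone(ZoneId.of(zone))
            .withZoneSameInstant(ZoneOffset.UTC)
            .toLocalDateTime()
            .format(FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(toUtc("2017-07-14 02:40:00.0", "GMT+1"));
    }
}
```

With the zone "GMT+1" the wall-clock time 02:40 is one hour ahead of UTC, so the rendered UTC value is 01:40 on the same day.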
- toApacheCommonsStats(StatCounter) - Method in interface org.apache.spark.mllib.stat.test.StreamingTestMethod
-
Implicit adapter to convert between streaming summary statistics type and the type required by
the t-testing libraries.
- toApi() - Method in class org.apache.spark.status.LiveRDDDistribution
-
- toApi() - Method in class org.apache.spark.status.LiveStage
-
- toArray() - Method in class org.apache.spark.input.PortableDataStream
-
Read the file as a byte array
- toArray() - Method in class org.apache.spark.ml.linalg.DenseVector
-
- toArray() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts to a dense array in column major.
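As a reminder of what "column major" means for the returned array, here is a minimal standalone sketch that flattens a row-indexed 2-D array. The class `ColumnMajorSketch` and its method are hypothetical and do not use Spark's Matrix type; the input is assumed to be rectangular.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark.
class ColumnMajorSketch {
    // Flattens rows[i][j] so that entry (i, j) lands at index j * numRows + i.
    static double[] toColumnMajor(double[][] rows) {
        int numRows = rows.length;
        int numCols = (numRows == 0) ? 0 : rows[0].length;
        double[] out = new double[numRows * numCols];
        for (int j = 0; j < numCols; j++) {
            for (int i = 0; i < numRows; i++) {
                out[j * numRows + i] = rows[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1.0, 2.0, 3.0}, {4.0, 5.0, 6.0}};
        System.out.println(Arrays.toString(toColumnMajor(m)));
    }
}
```

Each column is stored contiguously, so the first column of the 2 x 3 example occupies the first two slots of the output.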
- toArray() - Method in class org.apache.spark.ml.linalg.SparseVector
-
- toArray() - Method in interface org.apache.spark.ml.linalg.Vector
-
Converts the instance to a double array.
- toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Converts to a dense array in column major.
- toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toArray() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the instance to a double array.
- toBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
-
- toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to BlockMatrix.
- toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to BlockMatrix.
- toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Converts to BlockMatrix.
- toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Converts to BlockMatrix.
- toBoolean(String, String) - Static method in class org.apache.spark.internal.config.ConfigHelpers
-
- toBooleanArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
-
Collects data and assembles a local dense breeze matrix (for test only).
- toByte() - Method in class org.apache.spark.sql.types.Decimal
-
- toByteArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- toByteArray() - Method in class org.apache.spark.util.sketch.CountMinSketch
-
- toByteBuffer() - Method in interface org.apache.spark.storage.BlockData
-
- toByteBuffer() - Method in class org.apache.spark.storage.DiskBlockData
-
- toCatalystDecimal(HiveDecimalObjectInspector, Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- toChunkedByteBuffer(Function1<Object, ByteBuffer>) - Method in interface org.apache.spark.storage.BlockData
-
- toChunkedByteBuffer(Function1<Object, ByteBuffer>) - Method in class org.apache.spark.storage.DiskBlockData
-
- toColumn() - Method in class org.apache.spark.sql.expressions.Aggregator
-
Returns this Aggregator as a TypedColumn that can be used in Dataset.
- toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Converts to CoordinateMatrix.
- toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- toCryptoConf(SparkConf) - Static method in class org.apache.spark.security.CryptoStreamUtils
-
- toDDL() - Method in class org.apache.spark.sql.types.StructField
-
Returns a string containing a schema in DDL format.
- toDDL() - Method in class org.apache.spark.sql.types.StructType
-
Returns a string containing a schema in DDL format.
- toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
A description of this RDD and its recursive dependencies for debugging.
- toDebugString() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel
-
Full description of model
- toDebugString() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel
-
Full description of model
- toDebugString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Print the full model to a string.
- toDebugString() - Method in class org.apache.spark.rdd.RDD
-
A description of this RDD and its recursive dependencies for debugging.
- toDebugString() - Method in class org.apache.spark.SparkConf
-
Return a string listing all keys and values, one per line.
- toDebugString() - Method in class org.apache.spark.sql.types.Decimal
-
- toDegrees(Column) - Static method in class org.apache.spark.sql.functions
-
- toDegrees(String) - Static method in class org.apache.spark.sql.functions
-
- toDense() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts this matrix to a dense matrix while maintaining the layout of the current matrix.
- toDense() - Method in interface org.apache.spark.ml.linalg.Vector
-
Converts this vector to a dense vector.
- toDense() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a DenseMatrix from the given SparseMatrix.
- toDense() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts this vector to a dense vector.
- toDenseColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts this matrix to a dense matrix in column major order.
- toDenseMatrix(boolean) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts this matrix to a dense matrix.
- toDenseRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts this matrix to a dense matrix in row major order.
- toDF(String...) - Method in class org.apache.spark.sql.Dataset
-
Converts this strongly typed collection of data to generic DataFrame
with columns renamed.
- toDF() - Method in class org.apache.spark.sql.Dataset
-
Converts this strongly typed collection of data to generic DataFrame.
- toDF(Seq<String>) - Method in class org.apache.spark.sql.Dataset
-
Converts this strongly typed collection of data to generic DataFrame
with columns renamed.
- toDF() - Method in class org.apache.spark.sql.DatasetHolder
-
- toDF(Seq<String>) - Method in class org.apache.spark.sql.DatasetHolder
-
- toDouble(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- toDouble() - Method in class org.apache.spark.sql.types.Decimal
-
- toDoubleArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- toDS() - Method in class org.apache.spark.sql.DatasetHolder
-
- toEdgeTriplet() - Method in class org.apache.spark.graphx.EdgeContext
-
Converts the edge and vertex properties into an EdgeTriplet for convenience.
- toErrorString() - Method in class org.apache.spark.ExceptionFailure
-
- toErrorString() - Method in class org.apache.spark.ExecutorLostFailure
-
- toErrorString() - Method in class org.apache.spark.FetchFailed
-
- toErrorString() - Static method in class org.apache.spark.Resubmitted
-
- toErrorString() - Method in class org.apache.spark.TaskCommitDenied
-
- toErrorString() - Method in interface org.apache.spark.TaskFailedReason
-
Error message displayed in the web UI.
- toErrorString() - Method in class org.apache.spark.TaskKilled
-
- toErrorString() - Static method in class org.apache.spark.TaskResultLost
-
- toErrorString() - Static method in class org.apache.spark.UnknownReason
-
- toFloat(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- toFloat() - Method in class org.apache.spark.sql.types.Decimal
-
- toFloatArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- toFormattedString() - Method in class org.apache.spark.streaming.Duration
-
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Converts to IndexedRowMatrix.
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to IndexedRowMatrix.
- toInputStream() - Method in interface org.apache.spark.storage.BlockData
-
- toInputStream() - Method in class org.apache.spark.storage.DiskBlockData
-
- toInspector(DataType) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- toInspector(Expression) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
Maps the Catalyst expression to an ObjectInspector. If the expression is a Literal
or is foldable, a constant writable object inspector is returned;
otherwise, the object inspector is chosen according to the expression's Catalyst data type.
- toInspector(DataType) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- toInspector(Expression) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- toInt(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- toInt() - Method in class org.apache.spark.sql.types.Decimal
-
- toInt() - Method in class org.apache.spark.storage.StorageLevel
-
- toIntArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- toJavaBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
-
- toJavaBigInteger() - Method in class org.apache.spark.sql.types.Decimal
-
- toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Convert to a JavaDStream
- toJavaRDD() - Method in class org.apache.spark.rdd.RDD
-
- toJavaRDD() - Method in class org.apache.spark.sql.Dataset
-
Returns the content of the Dataset as a JavaRDD of Ts.
- toJson(Matrix) - Static method in class org.apache.spark.ml.linalg.JsonMatrixConverter
-
Converts the Matrix to a JSON string.
- toJson(Vector) - Static method in class org.apache.spark.ml.linalg.JsonVectorConverter
-
Converts the vector to a JSON string.
- toJson() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toJson() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toJson() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the vector to a JSON string.
- toJSON() - Method in class org.apache.spark.sql.Dataset
-
Returns the content of the Dataset as a Dataset of JSON strings.
- Tokenizer - Class in org.apache.spark.ml.feature
-
A tokenizer that converts the input string to lowercase and then splits it by white spaces.
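The behavior described for Tokenizer (lowercase, then split on white space) can be sketched in plain Java. The class `TokenizerSketch` and its method are hypothetical and do not use Spark. Treating a run of white space as a single separator and trimming the ends are assumptions of this sketch and may not match Spark's handling of repeated white space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Spark.
class TokenizerSketch {
    // Lowercases the input, then splits it on runs of white space.
    static List<String> tokenize(String text) {
        String normalized = text.toLowerCase(Locale.ROOT).trim();
        if (normalized.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(normalized.split("\\s+")));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hi I heard about Spark"));
    }
}
```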
- Tokenizer(String) - Constructor for class org.apache.spark.ml.feature.Tokenizer
-
- Tokenizer() - Constructor for class org.apache.spark.ml.feature.Tokenizer
-
- tokens() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens
-
- tol() - Method in interface org.apache.spark.ml.param.shared.HasTol
-
Param for the convergence tolerance for iterative algorithms (>= 0).
- toLocal() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
Convert this distributed model to a local representation.
- toLocal() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Convert model to a local model.
- toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an iterator that contains all of the elements in this RDD.
- toLocalIterator() - Method in class org.apache.spark.rdd.RDD
-
Return an iterator that contains all of the elements in this RDD.
- toLocalIterator() - Method in class org.apache.spark.sql.Dataset
-
Returns an iterator that contains all rows in this Dataset.
- toLocalMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Collect the distributed matrix on the driver as a DenseMatrix.
- toLong(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
-
- toLong() - Method in class org.apache.spark.sql.types.Decimal
-
- toLongArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- toLowercase() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
Indicates whether to convert all characters to lowercase before tokenizing.
- toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute
-
Converts to ML metadata with some existing metadata.
- toMetadata() - Method in class org.apache.spark.ml.attribute.Attribute
-
Converts to ML metadata
- toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Converts to ML metadata with some existing metadata.
- toMetadata() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Converts to ML metadata
- toMetadata(Metadata) - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- toMetadata() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- toNetty() - Method in interface org.apache.spark.storage.BlockData
-
Returns a Netty-friendly wrapper for the block's data.
- toNetty() - Method in class org.apache.spark.storage.DiskBlockData
-
Returns a Netty-friendly wrapper for the block's data.
- toNumber(String, Function1<String, T>, String, String) - Static method in class org.apache.spark.internal.config.ConfigHelpers
-
- toOld() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel
-
Convert to spark.mllib DecisionTreeModel (losing some information)
- toOld() - Method in interface org.apache.spark.ml.tree.Split
-
Convert to old Split format
- tooltip(String, String) - Static method in class org.apache.spark.ui.UIUtils
-
- ToolTips - Class in org.apache.spark.ui
-
- ToolTips() - Constructor for class org.apache.spark.ui.ToolTips
-
- toOps(T, ClassTag<VD>) - Method in interface org.apache.spark.graphx.impl.VertexPartitionBaseOpsConstructor
-
- top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the top k (largest) elements from this RDD as defined by
the specified Comparator[T] and maintains the order.
- top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the top k (largest) elements from this RDD using the
natural ordering for T and maintains the order.
- top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the top k (largest) elements from this RDD as defined by the specified
implicit Ordering[T] and maintains the ordering.
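The top-k semantics described by the top entries above can be illustrated without a Spark cluster. The following is a minimal plain-Java sketch, assuming the result is returned in descending order under the supplied comparator. The class TopKSketch and its top helper are hypothetical names used only for illustration. They are not part of Spark, and this is not Spark's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of the Spark API.
class TopKSketch {

    // Returns the k largest elements under cmp, largest first.
    static <T> List<T> top(Iterable<T> items, int k, Comparator<T> cmp) {
        List<T> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Min-heap bounded to k elements: the head is the smallest kept so far.
        PriorityQueue<T> heap = new PriorityQueue<>(k, cmp);
        for (T item : items) {
            if (heap.size() < k) {
                heap.add(item);
            } else if (cmp.compare(item, heap.peek()) > 0) {
                // The new item beats the smallest kept element, so swap it in.
                heap.poll();
                heap.add(item);
            }
        }
        result.addAll(heap);
        // Heap iteration order is unspecified, so sort descending at the end.
        result.sort(cmp.reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(10, 4, 2, 12, 3);
        System.out.println(top(data, 2, Comparator.<Integer>naturalOrder()));
    }
}
```

The bounded heap keeps memory proportional to k rather than to the input size, which is one common way to compute a top-k result over a large collection.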
- toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.dstream.DStream
-
- topByKey(int, Ordering<V>) - Method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions
-
Returns the top k (largest) elements for each key from this RDD as defined by the specified
implicit Ordering[T].
- topDocumentsPerTopic(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Return the top documents for each topic
- topicAssignments() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Return the top topic for each (doc, term) pair.
- topicConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
distributions over terms.
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
distributions over terms.
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- topicDistribution(Vector) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Predicts the topic mixture distribution for a document (often called "theta" in the
literature).
- topicDistributionCol() - Method in interface org.apache.spark.ml.clustering.LDAParams
-
Output column with estimates of the topic mixture distribution for each document (often called
"theta" in the literature).
- topicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
For each document in the training set, return the distribution over topics for that document
("theta_doc").
- topicDistributions(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Predicts the topic mixture distribution for each document (often called "theta" in the
literature).
- topicDistributions(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
Java-friendly version of topicDistributions
- topics() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- topicsMatrix() - Method in class org.apache.spark.ml.clustering.LDAModel
-
Inferred topics, where each topic is represented by a distribution over terms.
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Inferred topics, where each topic is represented by a distribution over terms.
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Inferred topics, where each topic is represented by a distribution over terms.
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- topK(Iterator<Tuple2<String, Object>>, int) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
Gets the top k words in terms of word counts.
- toPMML(StreamResult) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
Export the model to the stream result in PMML format
- toPMML(String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
Export the model to a local file in PMML format
- toPMML(SparkContext, String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
Export the model to a directory on a distributed file system in PMML format
- toPMML(OutputStream) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
Export the model to the OutputStream in PMML format
- toPMML() - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
Export the model to a String in PMML format
- topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- Topology - Interface in org.apache.spark.ml.ann
-
Trait for the artificial neural network (ANN) topology properties
- topologyFile() - Method in class org.apache.spark.storage.FileBasedTopologyMapper
-
- topologyInfo() - Method in class org.apache.spark.storage.BlockManagerId
-
- topologyMap() - Method in class org.apache.spark.storage.FileBasedTopologyMapper
-
- TopologyMapper - Class in org.apache.spark.storage
-
Developer API
TopologyMapper provides topology information for a given host
param: conf SparkConf to get required properties, if needed
- TopologyMapper(SparkConf) - Constructor for class org.apache.spark.storage.TopologyMapper
-
- TopologyModel - Interface in org.apache.spark.ml.ann
-
Trait for ANN topology model
- toPredict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- topTopicsPerDocument(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
For each document, return the top k weighted topics for that document and their weights.
- toRadians(Column) - Static method in class org.apache.spark.sql.functions
-
- toRadians(String) - Static method in class org.apache.spark.sql.functions
-
- toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-
- toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to RowMatrix, dropping row indices after grouping by row index.
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Drops row indices and converts this matrix to a RowMatrix.
- toScalaBigInt() - Method in class org.apache.spark.sql.types.Decimal
-
- toSeq() - Method in class org.apache.spark.ml.param.ParamMap
-
Converts this param map to a sequence of param pairs.
- toSeq() - Method in interface org.apache.spark.sql.Row
-
Return a Scala Seq representing the row.
- toShort() - Method in class org.apache.spark.sql.types.Decimal
-
- toShortArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
-
- toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-
- toSparse() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts this matrix to a sparse matrix while maintaining the layout of the current matrix.
- toSparse() - Method in interface org.apache.spark.ml.linalg.Vector
-
Converts this vector to a sparse vector with all explicit zeros removed.
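As a rough illustration of what "all explicit zeros removed" means for a dense-to-sparse conversion, here is a minimal plain-Java sketch. It assumes a sparse vector is represented by its size plus parallel arrays of indices and values. The class SparseSketch and its fromDense method are hypothetical names for this example only and are not part of Spark.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the Spark API.
class SparseSketch {
    final int size;        // logical length of the vector
    final int[] indices;   // positions of the non-zero entries, ascending
    final double[] values; // the non-zero entries, aligned with indices

    SparseSketch(int size, int[] indices, double[] values) {
        this.size = size;
        this.indices = indices;
        this.values = values;
    }

    // Builds a sparse representation that stores only the non-zero entries.
    static SparseSketch fromDense(double[] dense) {
        int nnz = 0;
        for (double v : dense) {
            if (v != 0.0) {
                nnz++;
            }
        }
        int[] idx = new int[nnz];
        double[] vals = new double[nnz];
        int j = 0;
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0.0) {
                idx[j] = i;
                vals[j] = dense[i];
                j++;
            }
        }
        return new SparseSketch(dense.length, idx, vals);
    }

    public static void main(String[] args) {
        SparseSketch s = fromDense(new double[] {1.0, 0.0, 3.0});
        System.out.println(s.size + " " + Arrays.toString(s.indices)
                + " " + Arrays.toString(s.values));
    }
}
```

The logical size is kept alongside the stored entries so that trailing zeros are not lost when the zeros themselves are dropped.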
- toSparse() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a SparseMatrix from the given DenseMatrix.
- toSparse() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts this vector to a sparse vector with all explicit zeros removed.
- toSparseColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts this matrix to a sparse matrix in column major order.
- toSparseMatrix(boolean) - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts this matrix to a sparse matrix.
- toSparseRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Converts this matrix to a sparse matrix in row major order.
- toSparseWithSize(int) - Method in interface org.apache.spark.ml.linalg.Vector
-
Converts this vector to a sparse vector with all explicit zeros removed when the size is known.
- toSparseWithSize(int) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts this vector to a sparse vector with all explicit zeros removed when the size is known.
- toSplit() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-
- toString() - Method in class org.apache.spark.Accumulable
-
Deprecated.
- toString() - Method in class org.apache.spark.api.java.JavaRDD
-
- toString() - Method in class org.apache.spark.api.java.Optional
-
- toString() - Method in class org.apache.spark.broadcast.Broadcast
-
- toString() - Static method in class org.apache.spark.CleanAccum
-
- toString() - Static method in class org.apache.spark.CleanBroadcast
-
- toString() - Static method in class org.apache.spark.CleanCheckpoint
-
- toString() - Static method in class org.apache.spark.CleanRDD
-
- toString() - Static method in class org.apache.spark.CleanShuffle
-
- toString() - Method in class org.apache.spark.ContextBarrierId
-
- toString() - Static method in class org.apache.spark.ExceptionFailure
-
- toString() - Static method in class org.apache.spark.ExecutorLostFailure
-
- toString() - Static method in class org.apache.spark.ExecutorRegistered
-
- toString() - Static method in class org.apache.spark.ExecutorRemoved
-
- toString() - Static method in class org.apache.spark.FetchFailed
-
- toString() - Method in class org.apache.spark.graphx.EdgeDirection
-
- toString() - Method in class org.apache.spark.graphx.EdgeTriplet
-
- toString() - Method in class org.apache.spark.ml.attribute.Attribute
-
- toString() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
- toString() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- toString() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- toString() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- toString() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- toString() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
-
- toString() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- toString() - Static method in class org.apache.spark.ml.clustering.ClusterData
-
- toString() - Method in class org.apache.spark.ml.feature.LabeledPoint
-
- toString() - Method in class org.apache.spark.ml.feature.RFormula
-
- toString() - Method in class org.apache.spark.ml.feature.RFormulaModel
-
- toString() - Method in class org.apache.spark.ml.linalg.DenseVector
-
- toString() - Method in interface org.apache.spark.ml.linalg.Matrix
-
A human readable representation of the matrix
- toString(int, int) - Method in interface org.apache.spark.ml.linalg.Matrix
-
A human readable representation of the matrix with maximum lines and width
- toString() - Method in class org.apache.spark.ml.linalg.SparseVector
-
- toString() - Method in class org.apache.spark.ml.param.Param
-
- toString() - Method in class org.apache.spark.ml.param.ParamMap
-
- toString() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- toString() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- toString() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
-
- toString() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- toString() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel
-
Summary of the model
- toString() - Method in class org.apache.spark.ml.tree.InternalNode
-
- toString() - Method in class org.apache.spark.ml.tree.LeafNode
-
- toString() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel
-
Summary of the model
- toString() - Method in interface org.apache.spark.ml.util.Identifiable
-
- toString() - Static method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
-
- toString() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- toString() - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
-
- toString() - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
-
- toString() - Method in class org.apache.spark.mllib.classification.SVMModel
-
- toString() - Static method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data
-
- toString() - Static method in class org.apache.spark.mllib.feature.VocabWord
-
- toString() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
-
- toString() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toString() - Static method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
-
- toString() - Static method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-
- toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
A human readable representation of the matrix
- toString(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
A human readable representation of the matrix with maximum lines and width
- toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toString() - Static method in class org.apache.spark.mllib.recommendation.Rating
-
- toString() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Print a summary of the model.
- toString() - Static method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
-
- toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- toString() - Method in class org.apache.spark.mllib.stat.test.BinarySample
-
- toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- toString() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
-
- toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
String explaining the hypothesis test result.
- toString() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- toString() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- toString() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- toString() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Print a summary of the model.
- toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Node
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Split
-
- toString() - Method in class org.apache.spark.partial.BoundedDouble
-
- toString() - Method in class org.apache.spark.partial.PartialResult
-
- toString() - Static method in class org.apache.spark.rdd.CheckpointState
-
- toString() - Static method in class org.apache.spark.rdd.DeterministicLevel
-
- toString() - Method in class org.apache.spark.rdd.RDD
-
- toString() - Static method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- toString() - Static method in class org.apache.spark.scheduler.BlacklistedExecutor
-
- toString() - Static method in class org.apache.spark.scheduler.ExecutorKilled
-
- toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- toString() - Static method in class org.apache.spark.scheduler.local.KillTask
-
- toString() - Static method in class org.apache.spark.scheduler.local.ReviveOffers
-
- toString() - Static method in class org.apache.spark.scheduler.local.StatusUpdate
-
- toString() - Static method in class org.apache.spark.scheduler.local.StopExecutor
-
- toString() - Static method in class org.apache.spark.scheduler.LossReasonPending
-
- toString() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerBlockUpdated
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerJobEnd
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerLogStart
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerSpeculativeTaskSubmitted
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- toString() - Static method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- toString() - Method in class org.apache.spark.scheduler.SplitInfo
-
- toString() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- toString() - Method in class org.apache.spark.SerializableWritable
-
- toString() - Method in class org.apache.spark.sql.catalog.Column
-
- toString() - Method in class org.apache.spark.sql.catalog.Database
-
- toString() - Method in class org.apache.spark.sql.catalog.Function
-
- toString() - Method in class org.apache.spark.sql.catalog.Table
-
- toString() - Method in class org.apache.spark.sql.Column
-
- toString() - Method in class org.apache.spark.sql.Dataset
-
- toString() - Static method in class org.apache.spark.sql.expressions.UserDefinedFunction
-
- toString() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
-
- toString() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
-
- toString() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- toString() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
-
- toString() - Static method in class org.apache.spark.sql.hive.HiveUDAFBuffer
-
- toString() - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- toString() - Static method in class org.apache.spark.sql.hive.RelationConversions
-
- toString() - Static method in class org.apache.spark.sql.jdbc.JdbcType
-
- toString() - Method in class org.apache.spark.sql.KeyValueGroupedDataset
-
- toString() - Method in interface org.apache.spark.sql.RelationalGroupedDataset.GroupType
-
- toString() - Method in class org.apache.spark.sql.RelationalGroupedDataset
-
- toString() - Method in interface org.apache.spark.sql.Row
-
- toString() - Static method in class org.apache.spark.sql.sources.And
-
- toString() - Static method in class org.apache.spark.sql.sources.EqualNullSafe
-
- toString() - Static method in class org.apache.spark.sql.sources.EqualTo
-
- toString() - Static method in class org.apache.spark.sql.sources.GreaterThan
-
- toString() - Static method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- toString() - Method in class org.apache.spark.sql.sources.In
-
- toString() - Static method in class org.apache.spark.sql.sources.IsNotNull
-
- toString() - Static method in class org.apache.spark.sql.sources.IsNull
-
- toString() - Static method in class org.apache.spark.sql.sources.LessThan
-
- toString() - Static method in class org.apache.spark.sql.sources.LessThanOrEqual
-
- toString() - Static method in class org.apache.spark.sql.sources.Not
-
- toString() - Static method in class org.apache.spark.sql.sources.Or
-
- toString() - Static method in class org.apache.spark.sql.sources.StringContains
-
- toString() - Static method in class org.apache.spark.sql.sources.StringEndsWith
-
- toString() - Static method in class org.apache.spark.sql.sources.StringStartsWith
-
- toString() - Method in class org.apache.spark.sql.sources.v2.reader.streaming.Offset
-
- toString() - Method in class org.apache.spark.sql.streaming.SinkProgress
-
- toString() - Method in class org.apache.spark.sql.streaming.SourceProgress
-
- toString() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
-
- toString() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
-
- toString() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
-
- toString() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
-
- toString() - Static method in class org.apache.spark.sql.types.CharType
-
- toString() - Method in class org.apache.spark.sql.types.Decimal
-
- toString() - Method in class org.apache.spark.sql.types.DecimalType
-
- toString() - Method in class org.apache.spark.sql.types.Metadata
-
- toString() - Method in class org.apache.spark.sql.types.StructField
-
- toString() - Static method in class org.apache.spark.sql.types.VarcharType
-
- toString() - Static method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- toString() - Static method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- toString() - Method in class org.apache.spark.status.api.v1.StackTrace
-
- toString() - Static method in class org.apache.spark.status.api.v1.ThreadStackTrace
-
- toString() - Method in class org.apache.spark.storage.BlockId
-
- toString() - Method in class org.apache.spark.storage.BlockManagerId
-
- toString() - Static method in class org.apache.spark.storage.BroadcastBlockId
-
- toString() - Static method in class org.apache.spark.storage.RDDBlockId
-
- toString() - Method in class org.apache.spark.storage.RDDInfo
-
- toString() - Static method in class org.apache.spark.storage.ShuffleBlockId
-
- toString() - Static method in class org.apache.spark.storage.ShuffleDataBlockId
-
- toString() - Static method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- toString() - Method in class org.apache.spark.storage.StorageLevel
-
- toString() - Static method in class org.apache.spark.storage.StreamBlockId
-
- toString() - Static method in class org.apache.spark.storage.TaskResultBlockId
-
- toString() - Method in class org.apache.spark.streaming.Duration
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
-
- toString() - Method in class org.apache.spark.streaming.State
-
- toString() - Method in class org.apache.spark.streaming.Time
-
- toString() - Static method in class org.apache.spark.TaskCommitDenied
-
- toString() - Static method in class org.apache.spark.TaskKilled
-
- toString() - Static method in class org.apache.spark.TaskState
-
- toString() - Method in class org.apache.spark.util.AccumulatorV2
-
- toString() - Method in class org.apache.spark.util.MutablePair
-
- toString() - Method in class org.apache.spark.util.StatCounter
-
- toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute
-
Converts to a StructField with some existing metadata.
- toStructField() - Method in class org.apache.spark.ml.attribute.Attribute
-
Converts to a StructField.
- toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Converts to a StructField with some existing metadata.
- toStructField() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Converts to a StructField.
- toStructField(Metadata) - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- toStructField() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- totalBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- totalBytesRead(ShuffleReadMetrics) - Static method in class org.apache.spark.ui.jobs.ApiHelper
-
- totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- totalCores() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalCores() - Method in class org.apache.spark.status.LiveExecutor
-
- totalCount() - Method in class org.apache.spark.util.sketch.CountMinSketch
-
- totalDelay() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
-
- totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for all the jobs of this batch to finish processing from the time they
were submitted.
- totalDiskSize() - Method in class org.apache.spark.ui.storage.ExecutorStreamSummary
-
- totalDuration() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalDuration() - Method in class org.apache.spark.status.LiveExecutor
-
- totalGCTime() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalGcTime() - Method in class org.apache.spark.status.LiveExecutor
-
- totalInputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalInputBytes() - Method in class org.apache.spark.status.LiveExecutor
-
- totalIterations() - Method in interface org.apache.spark.ml.classification.LogisticRegressionTrainingSummary
-
Number of training iterations.
- totalIterations() - Method in class org.apache.spark.ml.regression.LinearRegressionTrainingSummary
-
Number of training iterations until termination
- totalMemSize() - Method in class org.apache.spark.ui.storage.ExecutorStreamSummary
-
- totalNumNodes() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel
-
Total number of nodes, summed over all trees in the ensemble.
- totalOffHeap() - Method in class org.apache.spark.status.LiveExecutor
-
- totalOffHeapStorageMemory() - Method in interface org.apache.spark.SparkExecutorInfo
-
- totalOffHeapStorageMemory() - Method in class org.apache.spark.SparkExecutorInfoImpl
-
- totalOffHeapStorageMemory() - Method in class org.apache.spark.status.api.v1.MemoryMetrics
-
- totalOnHeap() - Method in class org.apache.spark.status.LiveExecutor
-
- totalOnHeapStorageMemory() - Method in interface org.apache.spark.SparkExecutorInfo
-
- totalOnHeapStorageMemory() - Method in class org.apache.spark.SparkExecutorInfoImpl
-
- totalOnHeapStorageMemory() - Method in class org.apache.spark.status.api.v1.MemoryMetrics
-
- totalShuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalShuffleRead() - Method in class org.apache.spark.status.LiveExecutor
-
- totalShuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalShuffleWrite() - Method in class org.apache.spark.status.LiveExecutor
-
- totalTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalTasks() - Method in class org.apache.spark.status.LiveExecutor
-
- toTuple() - Method in class org.apache.spark.graphx.EdgeTriplet
-
- toTypeInfo() - Method in class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
-
- toUnscaledLong() - Method in class org.apache.spark.sql.types.Decimal
-
- toVirtualHosts(Seq<String>) - Static method in class org.apache.spark.ui.JettyUtils
-
- train(RDD<ALS.Rating<ID>>, int, int, int, int, double, boolean, double, boolean, StorageLevel, StorageLevel, int, long, ClassTag<ID>, Ordering<ID>) - Static method in class org.apache.spark.ml.recommendation.ALS
-
Developer API
Implementation of the ALS algorithm.
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, double, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train an SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train an SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train an SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<Vector>, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the given set of parameters.
- train(RDD<Vector>, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the given set of parameters.
- train(RDD<Vector>, int, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
- train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
- train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using specified parameters and the default values for unspecified.
- train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
- train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings by users for a subset of products.
- train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings by users for a subset of products.
- train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings by users for a subset of products.
- train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings by users for a subset of products.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Method to train a gradient boosting model.
- train(JavaRDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.train
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model for binary or multiclass classification.
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Java-friendly API for org.apache.spark.mllib.tree.DecisionTree.trainClassifier
- trainClassifier(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a decision tree model for binary or multiclass classification.
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a decision tree model for binary or multiclass classification.
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Java-friendly API for org.apache.spark.mllib.tree.RandomForest.trainClassifier
- trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users
to some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' of users for a
subset of products.
- trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' of users for a
subset of products.
- trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' of users for a
subset of products.
- trainingCost() - Method in class org.apache.spark.ml.clustering.KMeansSummary
-
- trainingCost() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- trainingLogLikelihood() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
-
Log likelihood of the observed tokens in the training set,
given the current parameter estimates:
log P(docs | topics, topic distributions for docs, Dirichlet hyperparameters)
- trainOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Update the clustering model by training on batches of data from a DStream.
- trainOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Java-friendly version of trainOn.
- trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Update the model by training on batches of data from a DStream.
- trainOn(JavaDStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of trainOn.
- trainRatio() - Method in interface org.apache.spark.ml.tuning.TrainValidationSplitParams
-
Param for ratio between train and validation data.
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model for regression.
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Java-friendly API for org.apache.spark.mllib.tree.DecisionTree.trainRegressor
- trainRegressor(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a decision tree model for regression.
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a decision tree model for regression.
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Java-friendly API for org.apache.spark.mllib.tree.RandomForest.trainRegressor
- TrainValidationSplit - Class in org.apache.spark.ml.tuning
-
Validation for hyper-parameter tuning.
- TrainValidationSplit(String) - Constructor for class org.apache.spark.ml.tuning.TrainValidationSplit
-
- TrainValidationSplit() - Constructor for class org.apache.spark.ml.tuning.TrainValidationSplit
-
- TrainValidationSplitModel - Class in org.apache.spark.ml.tuning
-
Model from train validation split.
- TrainValidationSplitModel.TrainValidationSplitModelWriter - Class in org.apache.spark.ml.tuning
-
Writer for TrainValidationSplitModel.
- TrainValidationSplitParams - Interface in org.apache.spark.ml.tuning
-
- transferred() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
-
- transferTo(WritableByteChannel, long) - Method in class org.apache.spark.storage.ReadableChannelFileRegion
-
- transform(Function1<Try<T>, Try<S>>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
-
- transform(Function1<Try<T>, Try<S>>, ExecutionContext) - Method in interface org.apache.spark.FutureAction
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.classification.ClassificationModel
-
Transforms dataset by reading from featuresCol, and appending new columns as specified by
parameters:
- predicted labels as predictionCol of type Double
- raw predictions (confidences) as rawPredictionCol of type Vector.
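As a rough illustration of how the two output columns relate, the sketch below derives a predicted label from a raw prediction vector by taking the index of the largest score. The class and method names are hypothetical, and plain arrays stand in for Spark's Vector type; subclasses may apply different rules (for example thresholds), so treat this as an assumption rather than the library's implementation.

```java
class RawToPredictionSketch {
    // Returns the index of the largest raw score as a double label.
    // Ties resolve to the lowest index (an assumption of this sketch).
    static double predict(double[] rawPrediction) {
        if (rawPrediction.length == 0) {
            throw new IllegalArgumentException("empty raw prediction");
        }
        int best = 0;
        for (int i = 1; i < rawPrediction.length; i++) {
            if (rawPrediction[i] > rawPrediction[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Three-class example: the third class has the highest raw score.
        System.out.println(predict(new double[] {-1.2, 0.4, 3.1}));
    }
}
```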
- transform(Dataset<?>) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-
Transforms dataset by reading from featuresCol, and appending new columns as specified by
parameters:
- predicted labels as predictionCol of type Double
- raw predictions (confidences) as rawPredictionCol of type Vector
- probability of each class as probabilityCol of type Vector.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.clustering.KMeansModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.clustering.LDAModel
-
Transforms the input dataset.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.Binarizer
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.ColumnPruner
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.HashingTF
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.IDFModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.ImputerModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.IndexToString
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.Interaction
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.PCAModel
-
Transform a vector by computed Principal Components.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.RFormulaModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.SQLTransformer
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorSizeHint
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
Transform a sentence column to a vector column to represent the whole sentence.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
-
The transform method first generates the association rules according to the frequent itemsets.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.PipelineModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.PredictionModel
-
Transforms dataset by reading from featuresCol, calling predict, and storing
the predictions as a new column predictionCol.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
- transform(Dataset<?>, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with optional parameters.
- transform(Dataset<?>, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with optional parameters.
- transform(Dataset<?>, ParamMap) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with provided parameter map as additional parameters.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.Transformer
-
Transforms the input dataset.
- transform(Dataset<?>) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-
- transform(Dataset<?>) - Method in class org.apache.spark.ml.UnaryTransformer
-
- transform(Vector) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
Applies transformation on a vector.
- transform(Vector) - Method in class org.apache.spark.mllib.feature.ElementwiseProduct
-
Does the Hadamard product transformation.
- transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document into a sparse term frequency vector.
- transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document into a sparse term frequency vector (Java version).
- transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document to term frequency vectors.
- transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document to term frequency vectors (Java version).
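The hashing idea behind these term frequency vectors can be sketched as follows. This is a minimal stand-alone illustration, not Spark's code: the class name is hypothetical, String.hashCode is used as the hash function purely for simplicity, and a sorted map stands in for a sparse vector.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class HashingTfSketch {
    // Maps each term to a bucket in [0, numFeatures) and counts occurrences per bucket.
    // Different terms may collide in the same bucket; that is inherent to the hashing trick.
    static Map<Integer, Double> termFrequencies(List<String> document, int numFeatures) {
        Map<Integer, Double> tf = new TreeMap<>();
        for (String term : document) {
            // floorMod keeps the index non-negative even for negative hash codes.
            int index = Math.floorMod(term.hashCode(), numFeatures);
            tf.merge(index, 1.0, Double::sum);
        }
        return tf;
    }

    public static void main(String[] args) {
        System.out.println(termFrequencies(List.of("a", "b", "a"), 16));
    }
}
```

Only the non-zero buckets are stored, which is what makes the resulting vector sparse.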
- transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms term frequency (TF) vectors to TF-IDF vectors.
- transform(Vector) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms a term frequency (TF) vector to a TF-IDF vector.
- transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
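The TF to TF-IDF rescaling can be sketched with dense arrays as below. The class and method names are hypothetical; the smoothed formula idf = log((m + 1) / (df + 1)), with m the number of documents and df the number of documents containing the term, is the one given in the MLlib feature extraction guide. Minimum document frequency filtering is left out of this sketch.

```java
import java.util.Arrays;

class TfIdfSketch {
    // Smoothed inverse document frequency per term: log((m + 1) / (df + 1)).
    static double[] idf(long[] docFreq, long numDocs) {
        double[] out = new double[docFreq.length];
        for (int i = 0; i < docFreq.length; i++) {
            out[i] = Math.log((numDocs + 1.0) / (docFreq[i] + 1.0));
        }
        return out;
    }

    // Element-wise product of a term frequency vector with the IDF weights.
    static double[] tfIdf(double[] tf, double[] idf) {
        if (tf.length != idf.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double[] out = new double[tf.length];
        for (int i = 0; i < tf.length; i++) {
            out[i] = tf[i] * idf[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] weights = idf(new long[] {3, 1}, 3);
        System.out.println(Arrays.toString(tfIdf(new double[] {2.0, 1.0}, weights)));
    }
}
```

A term that occurs in every document gets a weight of zero, so it contributes nothing to the TF-IDF vector.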
- transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer
-
Applies unit length normalization on a vector.
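Unit length normalization can be sketched for the Euclidean (L2) case as below. The class name is hypothetical, a plain array stands in for Vector, and leaving an all-zero vector unchanged is an assumption of this sketch; other p-norms are not covered.

```java
import java.util.Arrays;

class NormalizerSketch {
    // Divides every component by the vector's L2 norm so the result has length 1.
    static double[] normalizeL2(double[] v) {
        double sumSq = 0.0;
        for (double x : v) {
            sumSq += x * x;
        }
        double norm = Math.sqrt(sumSq);
        double[] out = v.clone();
        if (norm != 0.0) {
            for (int i = 0; i < out.length; i++) {
                out[i] = v[i] / norm;
            }
        }
        // A zero vector has no direction, so it is returned as a copy (assumption).
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(normalizeL2(new double[] {3.0, 4.0})));
    }
}
```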
- transform(Vector) - Method in class org.apache.spark.mllib.feature.PCAModel
-
Transform a vector by computed Principal Components.
- transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
Applies standardization transformation on a vector.
- transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on a vector.
- transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on an RDD[Vector].
- transform(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on a JavaRDD[Vector].
- transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Transforms a word to its vector representation.
- transform(Function1<Try<T>, Try<S>>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
-
- transform(Function1<Dataset<T>, Dataset<U>>) - Method in class org.apache.spark.sql.Dataset
-
Concise syntax for chaining custom transformations.
- transform(Function<R, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Function2<R, Time, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transform(Function1<RDD<T>, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Function2<RDD<T>, Time, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- Transformer - Class in org.apache.spark.ml
-
Developer API
Abstract class for transformers that transform one dataset into another.
- Transformer() - Constructor for class org.apache.spark.ml.Transformer
-
- transformOutputColumnSchema(StructField, String, boolean, boolean) - Static method in class org.apache.spark.ml.feature.OneHotEncoderCommon
-
Prepares the StructField with proper metadata for OneHotEncoder's output column.
- transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.GaussianMixture
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.KMeans
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.KMeansModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.LDA
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.LDAModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Binarizer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ChiSqSelector
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ColumnPruner
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.CountVectorizer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.FeatureHasher
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.HashingTF
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDF
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDFModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Imputer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ImputerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IndexToString
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Interaction
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinHashLSH
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinMaxScaler
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Deprecated.
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.PCA
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.PCAModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.RFormula
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.RFormulaModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.SQLTransformer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StopWordsRemover
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorSizeHint
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorSlicer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.fpm.FPGrowth
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.Pipeline
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineStage
-
Developer API
- transformSchema(StructType) - Method in class org.apache.spark.ml.PredictionModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.Predictor
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALS
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.regression.IsotonicRegression
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.UnaryTransformer
-
- transformSchemaImpl(StructType) - Method in interface org.apache.spark.ml.tuning.ValidatorParams
-
- transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transformWith(Function1<Try<T>, Future<S>>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
-
- transformWith(Function1<Try<T>, Future<S>>, ExecutionContext) - Method in interface org.apache.spark.FutureAction
-
- transformWith(Function1<Try<T>, Future<S>>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
-
- transformWith(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(DStream<U>, Function2<RDD<T>, RDD<U>, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(DStream<U>, Function3<RDD<T>, RDD<U>, Time, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWithToPair(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- translate(Column, String, String) - Static method in class org.apache.spark.sql.functions
-
Translates each character in src that appears in matchingString to the corresponding character in replaceString.
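The per-character mapping can be sketched on plain strings as below. The class name is hypothetical and this is not the column-level function itself. The sketch assumes that a matched character with no counterpart in the replacement string is dropped, and that the first occurrence in the matching string wins.

```java
class TranslateSketch {
    // Replaces each character of src found in matching with the character
    // at the same position in replace; unmatched characters are kept.
    static String translate(String src, String matching, String replace) {
        StringBuilder out = new StringBuilder(src.length());
        for (char c : src.toCharArray()) {
            int i = matching.indexOf(c); // first occurrence wins
            if (i < 0) {
                out.append(c);
            } else if (i < replace.length()) {
                out.append(replace.charAt(i));
            }
            // Otherwise the character has no counterpart and is dropped (assumption).
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("AaBbCc", "abc", "123")); // prints A1B2C3
    }
}
```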
- transpose() - Method in class org.apache.spark.ml.linalg.DenseMatrix
-
- transpose() - Method in interface org.apache.spark.ml.linalg.Matrix
-
Transpose the Matrix.
- transpose() - Method in class org.apache.spark.ml.linalg.SparseMatrix
-
- transpose() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- transpose() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Transpose this BlockMatrix.
- transpose() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Transposes this CoordinateMatrix.
- transpose() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Transpose the Matrix.
- transpose() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregates the elements of this RDD in a multi-level tree pattern.
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
org.apache.spark.api.java.JavaRDDLike.treeAggregate with suggested depth 2.
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregates the elements of this RDD in a multi-level tree pattern.
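The multi-level tree pattern can be sketched on in-memory lists as below. This is a simplified single-machine illustration with hypothetical names: nested lists stand in for partitions, and a fixed fan-in parameter replaces the depth-based scheduling a real cluster would use. The zero value is assumed to be immutable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

class TreeAggregateSketch {
    static <T, U> U treeAggregate(List<List<T>> partitions, U zero,
                                  BiFunction<U, T, U> seqOp,
                                  BinaryOperator<U> combOp, int fanIn) {
        if (fanIn < 2) {
            throw new IllegalArgumentException("fanIn must be at least 2");
        }
        // Level 0: fold each partition independently with seqOp.
        List<U> level = new ArrayList<>();
        for (List<T> part : partitions) {
            U acc = zero;
            for (T x : part) {
                acc = seqOp.apply(acc, x);
            }
            level.add(acc);
        }
        // Higher levels: merge partial results in groups of fanIn until one remains.
        while (level.size() > 1) {
            List<U> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += fanIn) {
                U acc = level.get(i);
                int end = Math.min(i + fanIn, level.size());
                for (int j = i + 1; j < end; j++) {
                    acc = combOp.apply(acc, level.get(j));
                }
                next.add(acc);
            }
            level = next;
        }
        return level.isEmpty() ? zero : level.get(0);
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = List.of(
            List.of(1, 2), List.of(3), List.of(4, 5, 6), List.<Integer>of());
        System.out.println(treeAggregate(parts, 0, (a, x) -> a + x, Integer::sum, 2));
    }
}
```

Merging in several small rounds keeps any single combine step from having to process every partial result at once, which is the motivation for the tree pattern.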
- TreeClassifierParams - Interface in org.apache.spark.ml.tree
-
Parameters for Decision Tree-based classification algorithms.
- TreeEnsembleModel<M extends DecisionTreeModel> - Interface in org.apache.spark.ml.tree
-
Abstraction for models which are ensembles of decision trees.
- TreeEnsembleParams - Interface in org.apache.spark.ml.tree
-
Parameters for Decision Tree-based ensemble algorithms.
- treeID() - Method in class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData
-
- treeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- treeReduce(Function2<T, T, T>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Reduces the elements of this RDD in a multi-level tree pattern.
- treeReduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
org.apache.spark.api.java.JavaRDDLike.treeReduce with suggested depth 2.
- treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.rdd.RDD
-
Reduces the elements of this RDD in a multi-level tree pattern.
- TreeRegressorParams - Interface in org.apache.spark.ml.tree
-
Parameters for Decision Tree-based regression algorithms.
- trees() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- trees() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- trees() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- trees() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- trees() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel
-
Trees in this ensemble.
- trees() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- trees() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- treeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- treeString() - Method in class org.apache.spark.sql.types.StructType
-
- treeWeights() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- treeWeights() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- treeWeights() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- treeWeights() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- treeWeights() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel
-
Weights for each tree, zippable with trees.
- treeWeights() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- triangleCount() - Method in class org.apache.spark.graphx.GraphOps
-
Compute the number of triangles passing through each vertex.
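As an illustration of what "triangles passing through each vertex" means, the following plain-Java sketch counts them on a small undirected graph using neighbor-set intersection. The class name and the int[][] edge representation are assumptions of this sketch; it does not reflect how GraphX distributes the computation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-alone illustration; it does not use any GraphX classes.
class TriangleCountSketch {
    // Each edge is a pair {a, b}; edges are treated as undirected.
    static Map<Integer, Integer> triangleCount(int[][] edges) {
        Map<Integer, Set<Integer>> neighbors = new HashMap<>();
        for (int[] e : edges) {
            if (e[0] == e[1]) {
                continue; // self-loops cannot be part of a triangle
            }
            neighbors.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            neighbors.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        Map<Integer, Integer> counts = new TreeMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : neighbors.entrySet()) {
            Set<Integer> mine = entry.getValue();
            int twice = 0;
            // Every triangle {v, u, w} is seen twice: once via u and once via w.
            for (int u : mine) {
                for (int w : neighbors.get(u)) {
                    if (mine.contains(w)) {
                        twice++;
                    }
                }
            }
            counts.put(entry.getKey(), twice / 2);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] edges = {{1, 2}, {2, 3}, {3, 1}, {3, 4}};
        System.out.println(triangleCount(edges)); // {1=1, 2=1, 3=1, 4=0}
    }
}
```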
- TriangleCount - Class in org.apache.spark.graphx.lib
-
Compute the number of triangles passing through each vertex.
- TriangleCount() - Constructor for class org.apache.spark.graphx.lib.TriangleCount
-
- trigger(Trigger) - Method in class org.apache.spark.sql.streaming.DataStreamWriter
-
Set the trigger for the stream query.
- Trigger - Class in org.apache.spark.sql.streaming
-
Policy used to indicate how often results should be produced by a StreamingQuery.
- Trigger() - Constructor for class org.apache.spark.sql.streaming.Trigger
-
- TriggerThreadDump$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.TriggerThreadDump$
-
- trim(Column) - Static method in class org.apache.spark.sql.functions
-
Trim the spaces from both ends for the specified string column.
- trim(Column, String) - Static method in class org.apache.spark.sql.functions
-
Trim the specified character from both ends for the specified string column.
- TrimHorizon() - Constructor for class org.apache.spark.streaming.kinesis.KinesisInitialPositions.TrimHorizon
-
- TripletFields - Class in org.apache.spark.graphx
-
Represents a subset of the fields of an EdgeTriplet or EdgeContext.
- TripletFields() - Constructor for class org.apache.spark.graphx.TripletFields
-
Constructs a default TripletFields in which all fields are included.
- TripletFields(boolean, boolean, boolean) - Constructor for class org.apache.spark.graphx.TripletFields
-
- triplets() - Method in class org.apache.spark.graphx.Graph
-
An RDD containing the edge triplets, which are edges along with the vertex data associated with
the adjacent vertices.
- triplets() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
Return an RDD that brings edges together with their source and destination vertices.
- truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns true positive rate for a given label (category).
- truePositiveRateByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
-
Returns true positive rate for each label (category).
- trunc(Column, String) - Static method in class org.apache.spark.sql.functions
-
Returns date truncated to the unit specified by the format.
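The effect of truncating a date to a unit can be sketched with java.time alone. This is a rough illustration rather than Spark's implementation: the class name is hypothetical, and both the accepted unit strings and the null result for an unrecognised unit are assumptions of this sketch.

```java
import java.time.LocalDate;
import java.util.Locale;

// Hypothetical stand-alone illustration; it does not use any Spark classes.
class TruncSketch {
    static LocalDate trunc(LocalDate date, String format) {
        switch (format.toLowerCase(Locale.ROOT)) {
            case "year":
            case "yyyy":
            case "yy":
                return date.withDayOfYear(1);  // first day of the year
            case "month":
            case "mon":
            case "mm":
                return date.withDayOfMonth(1); // first day of the month
            default:
                return null; // assumed: unrecognised units yield no result
        }
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2018, 3, 15);
        System.out.println(trunc(d, "year"));  // 2018-01-01
        System.out.println(trunc(d, "month")); // 2018-03-01
    }
}
```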
- truncatedString(Seq<T>, String, String, String, int) - Static method in class org.apache.spark.util.Utils
-
Format a sequence with semantics similar to calling .mkString().
- truncatedString(Seq<T>, String) - Static method in class org.apache.spark.util.Utils
-
Shorthand for calling truncatedString() without start or end strings.
- tryLog(Function0<T>) - Static method in class org.apache.spark.util.Utils
-
Executes the given block in a Try, logging any uncaught exceptions.
- tryLogNonFatalError(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Executes the given block.
- tryOrExit(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Execute a block of code that evaluates to Unit, forwarding any uncaught exceptions to the
default UncaughtExceptionHandler.
- tryOrIOException(Function0<T>) - Static method in class org.apache.spark.util.Utils
-
Execute a block of code that returns a value, re-throwing any non-fatal uncaught
exceptions as IOException.
- tryOrStopSparkContext(SparkContext, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Execute a block of code that evaluates to Unit, stopping the SparkContext if there is any uncaught
exception.
- tryRecoverFromCheckpoint(String) - Method in class org.apache.spark.streaming.StreamingContextPythonHelper
-
This is a private method only for Python to implement getOrCreate.
- tryWithResource(Function0<R>, Function1<R, T>) - Static method in class org.apache.spark.util.Utils
-
- tryWithSafeFinally(Function0<T>, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Execute a block of code, then a finally block, but if exceptions happen in
the finally block, do not suppress the original exception.
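The "do not suppress the original exception" behavior can be sketched in plain Java. This is one possible reading of the description above, not the Utils implementation: the class name is hypothetical, and recording the cleanup failure with Throwable.addSuppressed is an assumption of this sketch.

```java
import java.util.function.Supplier;

// Hypothetical stand-alone illustration; it does not use any Spark classes.
class SafeFinallySketch {
    static <T> T tryWithSafeFinally(Supplier<T> block, Runnable finallyBlock) {
        Throwable original = null;
        try {
            return block.get();
        } catch (Throwable t) {
            original = t; // remember the failure of the main block
            throw t;
        } finally {
            try {
                finallyBlock.run();
            } catch (Throwable t) {
                if (original == null) {
                    throw t; // the main block succeeded: surface the cleanup failure
                }
                if (original != t) {
                    // Keep the original exception and attach the cleanup failure to it.
                    original.addSuppressed(t);
                }
            }
        }
    }

    public static void main(String[] args) {
        try {
            tryWithSafeFinally(
                    () -> { throw new IllegalStateException("body failed"); },
                    () -> { throw new RuntimeException("cleanup failed"); });
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());                    // body failed
            System.out.println(e.getSuppressed()[0].getMessage()); // cleanup failed
        }
    }
}
```

With a bare try/finally, the exception thrown by the cleanup code would replace the one from the main block; the wrapper keeps the first failure visible to the caller.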
- tryWithSafeFinallyAndFailureCallbacks(Function0<T>, Function0<BoxedUnit>, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Execute a block of code and call the failure callbacks in the catch block.
- tuple(Encoder<T1>, Encoder<T2>) - Static method in class org.apache.spark.sql.Encoders
-
An encoder for 2-ary tuples.
- tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>) - Static method in class org.apache.spark.sql.Encoders
-
An encoder for 3-ary tuples.
- tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>, Encoder<T4>) - Static method in class org.apache.spark.sql.Encoders
-
An encoder for 4-ary tuples.
- tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>, Encoder<T4>, Encoder<T5>) - Static method in class org.apache.spark.sql.Encoders
-
An encoder for 5-ary tuples.
- tValues() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
-
T-statistic of estimated coefficients and intercept.
- tValues() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
-
T-statistic of estimated coefficients and intercept.
- Tweedie$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Tweedie$
-
- TYPE() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
-
- typed - Class in org.apache.spark.sql.expressions.javalang
-
Experimental
Type-safe functions available for Dataset operations in Java.
- typed() - Constructor for class org.apache.spark.sql.expressions.javalang.typed
-
- typed - Class in org.apache.spark.sql.expressions.scalalang
-
Experimental
Type-safe functions available for Dataset operations in Scala.
- typed() - Constructor for class org.apache.spark.sql.expressions.scalalang.typed
-
- TypedColumn<T,U> - Class in org.apache.spark.sql
-
A Column where an Encoder has been given for the expected input and return type.
- TypedColumn(Expression, ExpressionEncoder<U>) - Constructor for class org.apache.spark.sql.TypedColumn
-
- typedLit(T, TypeTags.TypeTag<T>) - Static method in class org.apache.spark.sql.functions
-
Creates a Column of literal value.
- typeInfoConversions(DataType) - Constructor for class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
-
- typeInfoConversions(DataType) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
-
- typeName() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- typeName() - Static method in class org.apache.spark.sql.types.BinaryType
-
- typeName() - Static method in class org.apache.spark.sql.types.BooleanType
-
- typeName() - Static method in class org.apache.spark.sql.types.ByteType
-
- typeName() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
-
- typeName() - Method in class org.apache.spark.sql.types.DataType
-
Name of the type used in JSON serialization.
- typeName() - Static method in class org.apache.spark.sql.types.DateType
-
- typeName() - Method in class org.apache.spark.sql.types.DecimalType
-
- typeName() - Static method in class org.apache.spark.sql.types.DoubleType
-
- typeName() - Static method in class org.apache.spark.sql.types.FloatType
-
- typeName() - Static method in class org.apache.spark.sql.types.IntegerType
-
- typeName() - Static method in class org.apache.spark.sql.types.LongType
-
- typeName() - Static method in class org.apache.spark.sql.types.NullType
-
- typeName() - Static method in class org.apache.spark.sql.types.ShortType
-
- typeName() - Static method in class org.apache.spark.sql.types.StringType
-
- typeName() - Static method in class org.apache.spark.sql.types.TimestampType
-