/**
 * Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 * SPDX-License-Identifier: Apache-2.0.
 */

#pragma once
#include
#include
#include
#include
#include
#include

namespace Aws {
namespace Utils {
namespace Json {
  class JsonValue;
  class JsonView;
} // namespace Json
} // namespace Utils
namespace SageMaker {
namespace Model {

    /**

     * Describes the S3 data source.
     *
     * A key name prefix might look like this: s3://bucketname/exampleprefix
     *
     * A manifest might look like this: s3://bucketname/example.manifest
     *
     * A manifest is an S3 object that is a JSON file consisting of an array of
     * elements. The first element is a prefix, which is followed by one or more
     * suffixes. SageMaker appends the suffix elements to the prefix to get a
     * full set of S3Uri values. The prefix must be a valid, non-empty S3Uri,
     * which precludes specifying a manifest whose individual S3Uri values are
     * sourced from different S3 buckets.
     *
     * The following example shows a valid manifest format:
     *
     *   [ {"prefix": "s3://customer_bucket/some/prefix/"},
     *     "relative/path/to/custdata-1",
     *     "relative/path/custdata-2",
     *     ...
     *     "relative/path/custdata-N"
     *   ]
     *
     * This JSON is equivalent to the following S3Uri list:
     *
     *   s3://customer_bucket/some/prefix/relative/path/to/custdata-1
     *   s3://customer_bucket/some/prefix/relative/path/custdata-2
     *   ...
     *   s3://customer_bucket/some/prefix/relative/path/custdata-N
     *
     * The complete set of S3Uri values in this manifest is the input data for
     * the channel for this data source. The object that each S3Uri points to
     * must be readable by the IAM role that Amazon SageMaker uses to perform
     * tasks on your behalf.
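
As a rough illustration of the expansion described above, the following sketch
joins a prefix with each suffix by plain string concatenation. ExpandManifest is
a hypothetical helper, not part of the AWS SDK for C++, and it assumes the
manifest has already been parsed into a prefix and a list of suffixes. How
SageMaker performs the expansion internally is not specified here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not an SDK function): builds the full S3Uri list by
// appending each relative suffix to the manifest prefix.
std::vector<std::string> ExpandManifest(const std::string& prefix,
                                        const std::vector<std::string>& suffixes)
{
    // Assumption: a valid prefix is non-empty and starts with the s3 scheme.
    assert(prefix.rfind("s3://", 0) == 0);

    std::vector<std::string> uris;
    uris.reserve(suffixes.size());
    for (const auto& suffix : suffixes)
    {
        // Plain concatenation; no separator is inserted between the parts.
        uris.push_back(prefix + suffix);
    }
    return uris;
}
```

Because every result starts with the same prefix, all expanded S3Uri values
necessarily share one bucket, which matches the restriction stated above.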

     */
    inline const Aws::String& GetS3Uri() const { return m_s3Uri; }

    /**
     * Depending on the value specified for the S3DataType, identifies either a
     * key name prefix or a manifest. For example:
     *
     * A key name prefix might look like this: s3://bucketname/exampleprefix
     *
     * A manifest might look like this: s3://bucketname/example.manifest
     *
     * A manifest is an S3 object that is a JSON file consisting of an array of
     * elements. The first element is a prefix, which is followed by one or more
     * suffixes. SageMaker appends the suffix elements to the prefix to get a
     * full set of S3Uri values. The prefix must be a valid, non-empty S3Uri,
     * which precludes specifying a manifest whose individual S3Uri values are
     * sourced from different S3 buckets.
     *
     * The following example shows a valid manifest format:
     *
     *   [ {"prefix": "s3://customer_bucket/some/prefix/"},
     *     "relative/path/to/custdata-1",
     *     "relative/path/custdata-2",
     *     ...
     *     "relative/path/custdata-N"
     *   ]
     *
     * This JSON is equivalent to the following S3Uri list:
     *
     *   s3://customer_bucket/some/prefix/relative/path/to/custdata-1
     *   s3://customer_bucket/some/prefix/relative/path/custdata-2
     *   ...
     *   s3://customer_bucket/some/prefix/relative/path/custdata-N
     *
     * The complete set of S3Uri values in this manifest is the input data for
     * the channel for this data source. The object that each S3Uri points to
     * must be readable by the IAM role that Amazon SageMaker uses to perform
     * tasks on your behalf.

     */
    inline bool S3UriHasBeenSet() const { return m_s3UriHasBeenSet; }

    /**
     * Depending on the value specified for the S3DataType, identifies either a
     * key name prefix or a manifest. For example:
     *
     * A key name prefix might look like this: s3://bucketname/exampleprefix
     *
     * A manifest might look like this: s3://bucketname/example.manifest
     *
     * A manifest is an S3 object that is a JSON file consisting of an array of
     * elements. The first element is a prefix, which is followed by one or more
     * suffixes. SageMaker appends the suffix elements to the prefix to get a
     * full set of S3Uri values. The prefix must be a valid, non-empty S3Uri,
     * which precludes specifying a manifest whose individual S3Uri values are
     * sourced from different S3 buckets.
     *
     * The following example shows a valid manifest format:
     *
     *   [ {"prefix": "s3://customer_bucket/some/prefix/"},
     *     "relative/path/to/custdata-1",
     *     "relative/path/custdata-2",
     *     ...
     *     "relative/path/custdata-N"
     *   ]
     *
     * This JSON is equivalent to the following S3Uri list:
     *
     *   s3://customer_bucket/some/prefix/relative/path/to/custdata-1
     *   s3://customer_bucket/some/prefix/relative/path/custdata-2
     *   ...
     *   s3://customer_bucket/some/prefix/relative/path/custdata-N
     *
     * The complete set of S3Uri values in this manifest is the input data for
     * the channel for this data source. The object that each S3Uri points to
     * must be readable by the IAM role that Amazon SageMaker uses to perform
     * tasks on your behalf.

     */
    inline void SetS3Uri(const Aws::String& value) { m_s3UriHasBeenSet = true; m_s3Uri = value; }

    /**
     * Depending on the value specified for the S3DataType, identifies either a
     * key name prefix or a manifest. For example:
     *
     * A key name prefix might look like this: s3://bucketname/exampleprefix
     *
     * A manifest might look like this: s3://bucketname/example.manifest
     *
     * A manifest is an S3 object that is a JSON file consisting of an array of
     * elements. The first element is a prefix, which is followed by one or more
     * suffixes. SageMaker appends the suffix elements to the prefix to get a
     * full set of S3Uri values. The prefix must be a valid, non-empty S3Uri,
     * which precludes specifying a manifest whose individual S3Uri values are
     * sourced from different S3 buckets.
     *
     * The following example shows a valid manifest format:
     *
     *   [ {"prefix": "s3://customer_bucket/some/prefix/"},
     *     "relative/path/to/custdata-1",
     *     "relative/path/custdata-2",
     *     ...
     *     "relative/path/custdata-N"
     *   ]
     *
     * This JSON is equivalent to the following S3Uri list:
     *
     *   s3://customer_bucket/some/prefix/relative/path/to/custdata-1
     *   s3://customer_bucket/some/prefix/relative/path/custdata-2
     *   ...
     *   s3://customer_bucket/some/prefix/relative/path/custdata-N
     *
     * The complete set of S3Uri values in this manifest is the input data for
     * the channel for this data source. The object that each S3Uri points to
     * must be readable by the IAM role that Amazon SageMaker uses to perform
     * tasks on your behalf.

     */
    inline void SetS3Uri(Aws::String&& value) { m_s3UriHasBeenSet = true; m_s3Uri = std::move(value); }

    /**
     * Depending on the value specified for the S3DataType, identifies either a
     * key name prefix or a manifest. For example:
     *
     * A key name prefix might look like this: s3://bucketname/exampleprefix
     *
     * A manifest might look like this: s3://bucketname/example.manifest
     *
     * A manifest is an S3 object that is a JSON file consisting of an array of
     * elements. The first element is a prefix, which is followed by one or more
     * suffixes. SageMaker appends the suffix elements to the prefix to get a
     * full set of S3Uri values. The prefix must be a valid, non-empty S3Uri,
     * which precludes specifying a manifest whose individual S3Uri values are
     * sourced from different S3 buckets.
     *
     * The following example shows a valid manifest format:
     *
     *   [ {"prefix": "s3://customer_bucket/some/prefix/"},
     *     "relative/path/to/custdata-1",
     *     "relative/path/custdata-2",
     *     ...
     *     "relative/path/custdata-N"
     *   ]
     *
     * This JSON is equivalent to the following S3Uri list:
     *
     *   s3://customer_bucket/some/prefix/relative/path/to/custdata-1
     *   s3://customer_bucket/some/prefix/relative/path/custdata-2
     *   ...
     *   s3://customer_bucket/some/prefix/relative/path/custdata-N
     *
     * The complete set of S3Uri values in this manifest is the input data for
     * the channel for this data source. The object that each S3Uri points to
     * must be readable by the IAM role that Amazon SageMaker uses to perform
     * tasks on your behalf.

     */
    inline void SetS3Uri(const char* value) { m_s3UriHasBeenSet = true; m_s3Uri.assign(value); }

    /**
     * Depending on the value specified for the S3DataType, identifies either a
     * key name prefix or a manifest. For example:
     *
     * A key name prefix might look like this: s3://bucketname/exampleprefix
     *
     * A manifest might look like this: s3://bucketname/example.manifest
     *
     * A manifest is an S3 object that is a JSON file consisting of an array of
     * elements. The first element is a prefix, which is followed by one or more
     * suffixes. SageMaker appends the suffix elements to the prefix to get a
     * full set of S3Uri values. The prefix must be a valid, non-empty S3Uri,
     * which precludes specifying a manifest whose individual S3Uri values are
     * sourced from different S3 buckets.
     *
     * The following example shows a valid manifest format:
     *
     *   [ {"prefix": "s3://customer_bucket/some/prefix/"},
     *     "relative/path/to/custdata-1",
     *     "relative/path/custdata-2",
     *     ...
     *     "relative/path/custdata-N"
     *   ]
     *
     * This JSON is equivalent to the following S3Uri list:
     *
     *   s3://customer_bucket/some/prefix/relative/path/to/custdata-1
     *   s3://customer_bucket/some/prefix/relative/path/custdata-2
     *   ...
     *   s3://customer_bucket/some/prefix/relative/path/custdata-N
     *
     * The complete set of S3Uri values in this manifest is the input data for
     * the channel for this data source. The object that each S3Uri points to
     * must be readable by the IAM role that Amazon SageMaker uses to perform
     * tasks on your behalf.

     */
    inline S3DataSource& WithS3Uri(const Aws::String& value) { SetS3Uri(value); return *this; }

    /**
     * Depending on the value specified for the S3DataType, identifies either a
     * key name prefix or a manifest. For example:
     *
     * A key name prefix might look like this: s3://bucketname/exampleprefix
     *
     * A manifest might look like this: s3://bucketname/example.manifest
     *
     * A manifest is an S3 object that is a JSON file consisting of an array of
     * elements. The first element is a prefix, which is followed by one or more
     * suffixes. SageMaker appends the suffix elements to the prefix to get a
     * full set of S3Uri values. The prefix must be a valid, non-empty S3Uri,
     * which precludes specifying a manifest whose individual S3Uri values are
     * sourced from different S3 buckets.
     *
     * The following example shows a valid manifest format:
     *
     *   [ {"prefix": "s3://customer_bucket/some/prefix/"},
     *     "relative/path/to/custdata-1",
     *     "relative/path/custdata-2",
     *     ...
     *     "relative/path/custdata-N"
     *   ]
     *
     * This JSON is equivalent to the following S3Uri list:
     *
     *   s3://customer_bucket/some/prefix/relative/path/to/custdata-1
     *   s3://customer_bucket/some/prefix/relative/path/custdata-2
     *   ...
     *   s3://customer_bucket/some/prefix/relative/path/custdata-N
     *
     * The complete set of S3Uri values in this manifest is the input data for
     * the channel for this data source. The object that each S3Uri points to
     * must be readable by the IAM role that Amazon SageMaker uses to perform
     * tasks on your behalf.

     */
    inline S3DataSource& WithS3Uri(Aws::String&& value) { SetS3Uri(std::move(value)); return *this; }

    /**
     * Depending on the value specified for the S3DataType, identifies either a
     * key name prefix or a manifest. For example:
     *
     * A key name prefix might look like this: s3://bucketname/exampleprefix
     *
     * A manifest might look like this: s3://bucketname/example.manifest
     *
     * A manifest is an S3 object that is a JSON file consisting of an array of
     * elements. The first element is a prefix, which is followed by one or more
     * suffixes. SageMaker appends the suffix elements to the prefix to get a
     * full set of S3Uri values. The prefix must be a valid, non-empty S3Uri,
     * which precludes specifying a manifest whose individual S3Uri values are
     * sourced from different S3 buckets.
     *
     * The following example shows a valid manifest format:
     *
     *   [ {"prefix": "s3://customer_bucket/some/prefix/"},
     *     "relative/path/to/custdata-1",
     *     "relative/path/custdata-2",
     *     ...
     *     "relative/path/custdata-N"
     *   ]
     *
     * This JSON is equivalent to the following S3Uri list:
     *
     *   s3://customer_bucket/some/prefix/relative/path/to/custdata-1
     *   s3://customer_bucket/some/prefix/relative/path/custdata-2
     *   ...
     *   s3://customer_bucket/some/prefix/relative/path/custdata-N
     *
     * The complete set of S3Uri values in this manifest is the input data for
     * the channel for this data source. The object that each S3Uri points to
     * must be readable by the IAM role that Amazon SageMaker uses to perform
     * tasks on your behalf.

     */
    inline S3DataSource& WithS3Uri(const char* value) { SetS3Uri(value); return *this; }

    /**
     * If you want Amazon SageMaker to replicate the entire dataset on each ML
     * compute instance that is launched for model training, specify
     * FullyReplicated.
     *
     * If you want Amazon SageMaker to replicate a subset of data on each ML
     * compute instance that is launched for model training, specify
     * ShardedByS3Key. If there are n ML compute instances launched for a
     * training job, each instance gets approximately 1/n of the number of S3
     * objects. In this case, model training on each machine uses only its
     * subset of the training data.
     *
     * Don't choose more ML compute instances for training than available S3
     * objects. If you do, some nodes won't get any data, and you will pay for
     * nodes that aren't getting any training data. This applies in both File
     * and Pipe modes. Keep this in mind when developing algorithms.
     *
     * In distributed training, where you use multiple ML compute EC2
     * instances, you might choose ShardedByS3Key. If the algorithm requires
     * copying training data to the ML storage volume (when TrainingInputMode
     * is set to File), this copies 1/n of the number of objects.
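
To make the 1/n split concrete, the sketch below assigns object keys to
instances in round-robin order. ShardKeys is a hypothetical helper, and
round-robin assignment is an assumption chosen purely for illustration; the
actual assignment used by SageMaker is not described in this documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration only: round-robin is an assumed policy, not
// SageMaker's documented sharding algorithm.
std::vector<std::vector<std::string>> ShardKeys(const std::vector<std::string>& keys,
                                                std::size_t instanceCount)
{
    assert(instanceCount > 0);

    // One (possibly empty) shard per ML compute instance.
    std::vector<std::vector<std::string>> shards(instanceCount);
    for (std::size_t i = 0; i < keys.size(); ++i)
    {
        shards[i % instanceCount].push_back(keys[i]);
    }
    return shards;
}
```

Under this assumed policy, shard sizes differ by at most one object, and when
instanceCount exceeds keys.size() the surplus shards stay empty, which is the
situation the warning above tells you to avoid.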

     */
    inline const S3DataDistribution& GetS3DataDistributionType() const { return m_s3DataDistributionType; }

    /**
     * If you want Amazon SageMaker to replicate the entire dataset on each ML
     * compute instance that is launched for model training, specify
     * FullyReplicated.

     */
    inline bool S3DataDistributionTypeHasBeenSet() const { return m_s3DataDistributionTypeHasBeenSet; }

    /**
     * If you want Amazon SageMaker to replicate the entire dataset on each ML
     * compute instance that is launched for model training, specify
     * FullyReplicated.

     */
    inline void SetS3DataDistributionType(const S3DataDistribution& value) { m_s3DataDistributionTypeHasBeenSet = true; m_s3DataDistributionType = value; }

    /**
     * If you want Amazon SageMaker to replicate the entire dataset on each ML
     * compute instance that is launched for model training, specify
     * FullyReplicated.

     */
    inline void SetS3DataDistributionType(S3DataDistribution&& value) { m_s3DataDistributionTypeHasBeenSet = true; m_s3DataDistributionType = std::move(value); }

    /**
     * If you want Amazon SageMaker to replicate the entire dataset on each ML
     * compute instance that is launched for model training, specify
     * FullyReplicated.

     */
    inline S3DataSource& WithS3DataDistributionType(const S3DataDistribution& value) { SetS3DataDistributionType(value); return *this; }

    /**
     * If you want Amazon SageMaker to replicate the entire dataset on each ML
     * compute instance that is launched for model training, specify
     * FullyReplicated.

     */
    inline S3DataSource& WithS3DataDistributionType(S3DataDistribution&& value) { SetS3DataDistributionType(std::move(value)); return *this; }

    /**