azure.mgmt.datalake.analytics.job.models module

class azure.mgmt.datalake.analytics.job.models.JobStatisticsVertexStage[source]

Bases: msrest.serialization.Model

The Data Lake Analytics job statistics vertex stage information.

Variables are only populated by the server, and will be ignored when sending a request.

Variables:
  • data_read (long) – the amount of data read, in bytes.
  • data_read_cross_pod (long) – the amount of data read across multiple pods, in bytes.
  • data_read_intra_pod (long) – the amount of data read in one pod, in bytes.
  • data_to_read (long) – the amount of data remaining to be read, in bytes.
  • data_written (long) – the amount of data written, in bytes.
  • duplicate_discard_count (int) – the number of duplicates that were discarded.
  • failed_count (int) – the number of failures that occurred in this stage.
  • max_vertex_data_read (long) – the maximum amount of data read in a single vertex, in bytes.
  • min_vertex_data_read (long) – the minimum amount of data read in a single vertex, in bytes.
  • read_failure_count (int) – the number of read failures in this stage.
  • revocation_count (int) – the number of vertices that were revoked during this stage.
  • running_count (int) – the number of currently running vertices in this stage.
  • scheduled_count (int) – the number of currently scheduled vertices in this stage.
  • stage_name (str) – the name of this stage in job execution.
  • succeeded_count (int) – the number of vertices that succeeded in this stage.
  • temp_data_written (long) – the amount of temporary data written, in bytes.
  • total_count (int) – the total vertex count for this stage.
  • total_failed_time (timedelta) – the amount of time that failed vertices took up in this stage.
  • total_progress (int) – the current progress of this stage, as a percentage.
  • total_succeeded_time (timedelta) – the amount of time all successful vertices took in this stage.
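All of these counters are server-populated and read-only, so the typical client-side use is summarizing them after retrieving job statistics. A minimal sketch, using a plain dataclass as a stand-in for `JobStatisticsVertexStage` (only the fields used here are mirrored; the real model comes from the SDK):

```python
from dataclasses import dataclass

# Plain stand-in mirroring the few JobStatisticsVertexStage fields used
# below; the real model is populated by the server and read-only.
@dataclass
class StageCounts:
    stage_name: str
    succeeded_count: int
    failed_count: int
    total_count: int

def summarize(stages):
    """Return (stage_name, percent of vertices finished) per stage."""
    return [
        (s.stage_name, 100 * (s.succeeded_count + s.failed_count) // s.total_count)
        for s in stages if s.total_count
    ]

stages = [
    StageCounts("SV1_Extract", succeeded_count=8, failed_count=0, total_count=10),
    StageCounts("SV2_Aggregate", succeeded_count=4, failed_count=1, total_count=5),
]
print(summarize(stages))  # [('SV1_Extract', 80), ('SV2_Aggregate', 100)]
```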
class azure.mgmt.datalake.analytics.job.models.JobStatistics[source]

Bases: msrest.serialization.Model

The Data Lake Analytics job execution statistics.

Variables are only populated by the server, and will be ignored when sending a request.

Variables:
  • last_update_time_utc (datetime) – the last update time for the statistics.
  • finalizing_time_utc (datetime) – the job finalizing start time.
  • stages (list of JobStatisticsVertexStage) – the list of stages for the job.
class azure.mgmt.datalake.analytics.job.models.JobDataPath[source]

Bases: msrest.serialization.Model

A Data Lake Analytics job data path item.

Variables are only populated by the server, and will be ignored when sending a request.

Variables:
  • job_id (str) – the id of the job this data is for.
  • command (str) – the command that this job data relates to.
  • paths (list of str) – the list of paths to all of the job data.
class azure.mgmt.datalake.analytics.job.models.JobStateAuditRecord[source]

Bases: msrest.serialization.Model

The Data Lake Analytics job state audit records for tracking the lifecycle of a job.

Variables are only populated by the server, and will be ignored when sending a request.

Variables:
  • new_state (str) – the new state the job is in.
  • time_stamp (datetime) – the time stamp that the state change took place.
  • requested_by_user (str) – the user who requested the change.
  • details (str) – the details of the audit log.
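Because each record captures one state transition with its timestamp, sorting the records by `time_stamp` reconstructs the job's lifecycle. A sketch using a plain dataclass as a stand-in for the server-populated `JobStateAuditRecord`:

```python
from dataclasses import dataclass
from datetime import datetime

# Plain stand-in for JobStateAuditRecord (the SDK model is read-only).
@dataclass
class AuditRecord:
    new_state: str
    time_stamp: datetime
    requested_by_user: str = None
    details: str = None

records = [
    AuditRecord("Running", datetime(2017, 1, 5, 12, 3)),
    AuditRecord("New", datetime(2017, 1, 5, 12, 0)),
    AuditRecord("Compiling", datetime(2017, 1, 5, 12, 1)),
]
# Order the audit trail chronologically to see the lifecycle.
lifecycle = [r.new_state for r in sorted(records, key=lambda r: r.time_stamp)]
print(" -> ".join(lifecycle))  # New -> Compiling -> Running
```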
class azure.mgmt.datalake.analytics.job.models.JobResource(name=None, resource_path=None, type=None)[source]

Bases: msrest.serialization.Model

The Data Lake Analytics job resources.

Parameters:
  • name (str) – the name of the resource.
  • resource_path (str) – the path to the resource.
  • type (str or JobResourceType) – the job resource type. Possible values include: ‘VertexResource’, ‘JobManagerResource’, ‘StatisticsResource’, ‘VertexResourceInUserFolder’, ‘JobManagerResourceInUserFolder’, ‘StatisticsResourceInUserFolder’
class azure.mgmt.datalake.analytics.job.models.Diagnostics[source]

Bases: msrest.serialization.Model

Error diagnostic information for failed jobs.

Variables are only populated by the server, and will be ignored when sending a request.

Variables:
  • column_number (int) – the column where the error occurred.
  • end (int) – the ending index of the error.
  • line_number (int) – the line number the error occurred on.
  • message (str) – the error message.
  • severity (str or SeverityTypes) – the severity of the error. Possible values include: ‘Warning’, ‘Error’, ‘Info’, ‘SevereWarning’, ‘Deprecated’, ‘UserWarning’
  • start (int) – the starting index of the error.
class azure.mgmt.datalake.analytics.job.models.USqlJobProperties(script, runtime_version=None, resources=None, statistics=None, debug_data=None, diagnostics=None, compile_mode=None)[source]

Bases: azure.mgmt.datalake.analytics.job.models.job_properties.JobProperties

U-SQL job properties used when submitting and retrieving U-SQL jobs.

Variables are only populated by the server, and will be ignored when sending a request.

Parameters:
  • runtime_version (str) – the runtime version of the Data Lake Analytics engine to use for the specific type of job being run.
  • script (str) – the script to run.
  • type (str) – Polymorphic Discriminator
  • resources (list of JobResource) – the list of resources that are required by the job
  • statistics (JobStatistics) – the job specific statistics.
  • debug_data (JobDataPath) – the job specific debug data locations.
  • diagnostics (list of Diagnostics) – the diagnostics for the job.
  • compile_mode (str or CompileMode) – Optionally enforces a specific compilation mode for the job during execution. If this is not specified during submission, the server will determine the optimal compilation mode. Possible values include: ‘Semantic’, ‘Full’, ‘SingleBox’
Variables:
  • algebra_file_path (str) – the algebra file path after the job has completed
  • total_compilation_time (timedelta) – the total time this job spent compiling. This value should not be set by the user and will be ignored if it is.
  • total_pause_time (timedelta) – the total time this job spent paused. This value should not be set by the user and will be ignored if it is.
  • total_queued_time (timedelta) – the total time this job spent queued. This value should not be set by the user and will be ignored if it is.
  • total_running_time (timedelta) – the total time this job spent executing. This value should not be set by the user and will be ignored if it is.
  • root_process_node_id (str) – the ID used to identify the job manager coordinating job execution. This value should not be set by the user and will be ignored if it is.
  • yarn_application_id (str) – the ID used to identify the yarn application executing the job. This value should not be set by the user and will be ignored if it is.
  • yarn_application_time_stamp (long) – the timestamp (in ticks) for the yarn application executing the job. This value should not be set by the user and will be ignored if it is.
class azure.mgmt.datalake.analytics.job.models.HiveJobProperties(script, runtime_version=None)[source]

Bases: azure.mgmt.datalake.analytics.job.models.job_properties.JobProperties

Hive job properties used when submitting and retrieving Hive jobs.

Variables are only populated by the server, and will be ignored when sending a request.

Parameters:
  • runtime_version (str) – the runtime version of the Data Lake Analytics engine to use for the specific type of job being run.
  • script (str) – the script to run.
  • type (str) – Polymorphic Discriminator
Variables:
  • logs_location (str) – the Hive logs location.
  • output_location (str) – the location of Hive job output files (both execution output and results).
  • statement_count (int) – the number of statements that will be run based on the script.
  • executed_statement_count (int) – the number of statements that have been run based on the script.
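The two statement counters above give a rough completion measure for a Hive job. A small sketch of that arithmetic (a convenience one might write on top of these fields; the helper name is not part of the SDK):

```python
def hive_progress(executed_statement_count, statement_count):
    """Rough completion percentage from the two server-populated Hive
    statement counters; returns 0 when the script has no statements."""
    if not statement_count:
        return 0
    return round(100 * executed_statement_count / statement_count)

print(hive_progress(3, 8))  # 38
```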
class azure.mgmt.datalake.analytics.job.models.JobProperties(script, runtime_version=None)[source]

Bases: msrest.serialization.Model

The common Data Lake Analytics job properties.

Parameters:
  • runtime_version (str) – the runtime version of the Data Lake Analytics engine to use for the specific type of job being run.
  • script (str) – the script to run.
  • type (str) – Polymorphic Discriminator
class azure.mgmt.datalake.analytics.job.models.JobInnerError[source]

Bases: msrest.serialization.Model

The Data Lake Analytics job error details.

Variables are only populated by the server, and will be ignored when sending a request.

Variables:
  • diagnostic_code (int) – the diagnostic error code.
  • severity (str or SeverityTypes) – the severity level of the failure. Possible values include: ‘Warning’, ‘Error’, ‘Info’, ‘SevereWarning’, ‘Deprecated’, ‘UserWarning’
  • details (str) – the details of the error message.
  • component (str) – the component that failed.
  • error_id (str) – the specific identifier for the type of error encountered in the job.
  • help_link (str) – the link to MSDN or Azure help for this type of error, if any.
  • internal_diagnostics (str) – the internal diagnostic stack trace. It is retrieved only if the user requesting the job error details has sufficient permissions; otherwise it is empty.
  • message (str) – the user friendly error message for the failure.
  • resolution (str) – the recommended resolution for the failure, if any.
  • source (str) – the ultimate source of the failure (usually either SYSTEM or USER).
  • description (str) – the error message description.
class azure.mgmt.datalake.analytics.job.models.JobErrorDetails[source]

Bases: msrest.serialization.Model

The Data Lake Analytics job error details.

Variables are only populated by the server, and will be ignored when sending a request.

Variables:
  • description (str) – the error message description.
  • details (str) – the details of the error message.
  • end_offset (int) – the end offset in the job where the error was found.
  • error_id (str) – the specific identifier for the type of error encountered in the job.
  • file_path (str) – the path to any supplemental error files, if any.
  • help_link (str) – the link to MSDN or Azure help for this type of error, if any.
  • internal_diagnostics (str) – the internal diagnostic stack trace. It is retrieved only if the user requesting the job error details has sufficient permissions; otherwise it is empty.
  • line_number (int) – the specific line number in the job where the error occurred.
  • message (str) – the user friendly error message for the failure.
  • resolution (str) – the recommended resolution for the failure, if any.
  • inner_error (JobInnerError) – the inner error of this specific job error message, if any.
  • severity (str or SeverityTypes) – the severity level of the failure. Possible values include: ‘Warning’, ‘Error’, ‘Info’, ‘SevereWarning’, ‘Deprecated’, ‘UserWarning’
  • source (str) – the ultimate source of the failure (usually either SYSTEM or USER).
  • start_offset (int) – the start offset in the job where the error was found.
class azure.mgmt.datalake.analytics.job.models.JobInformation(name, type, properties, degree_of_parallelism=1, priority=None, log_file_patterns=None)[source]

Bases: msrest.serialization.Model

The common Data Lake Analytics job information properties.

Variables are only populated by the server, and will be ignored when sending a request.

Variables:
  • job_id (str) – the job’s unique identifier (a GUID).
  • submitter (str) – the user or account that submitted the job.
  • error_message (list of JobErrorDetails) – the error message details for the job, if the job failed.
  • submit_time (datetime) – the time the job was submitted to the service.
  • start_time (datetime) – the start time of the job.
  • end_time (datetime) – the completion time of the job.
  • state (str or JobState) – the job state. When the job is in the Ended state, refer to Result and ErrorMessage for details. Possible values include: ‘Accepted’, ‘Compiling’, ‘Ended’, ‘New’, ‘Queued’, ‘Running’, ‘Scheduling’, ‘Starting’, ‘Paused’, ‘WaitingForCapacity’
  • result (str or JobResult) – the result of job execution or the current result of the running job. Possible values include: ‘None’, ‘Succeeded’, ‘Cancelled’, ‘Failed’
  • log_folder (str) – the log folder path to use in the following format: adl://<accountName>.azuredatalakestore.net/system/jobservice/jobs/Usql/2016/03/13/17/18/5fe51957-93bc-4de0-8ddc-c5a4753b068b/logs/.
  • state_audit_records (list of JobStateAuditRecord) – the job state audit records, indicating when various operations have been performed on this job.
Parameters:
  • name (str) – the friendly name of the job.
  • type (str or JobType) – the job type of the current job (Hive or USql). Possible values include: ‘USql’, ‘Hive’
  • degree_of_parallelism (int) – the degree of parallelism used for this job. This must be greater than 0; if set to less than 0, it will default to 1. Default value: 1.
  • priority (int) – the priority value for the current job. Lower numbers have a higher priority. By default, a job has a priority of 1000. This must be greater than 0.
  • log_file_patterns (list of str) – the list of log file name patterns to find in the logFolder. ‘*’ is the only matching character allowed. Example format: jobExecution*.log or mylog.txt
  • properties (JobProperties) – the job specific properties.
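The parameters above are the client-settable part of a job submission. A hedged sketch of how they fit together, shown as a plain dict so the shape is visible without the SDK installed; in practice these become keyword arguments to the `JobInformation` constructor, and the resulting object is passed to the SDK's job-submission call (client and method names vary by SDK version and are only indicated in comments):

```python
# Keyword arguments one might assemble for JobInformation when submitting
# a U-SQL job. In real code: JobInformation(**job_kwargs), then something
# like job_client.job.create(account, job_id, job_info) -- the client and
# method names here are assumptions, check your installed SDK version.
job_kwargs = {
    "name": "my first job",            # friendly name, required
    "type": "USql",                    # str or a JobType enum member
    "degree_of_parallelism": 2,        # must be greater than 0
    "priority": 1000,                  # lower number = higher priority
    "log_file_patterns": ["jobExecution*.log"],
    # "properties" would be a USqlJobProperties(script=...) instance
}
print(sorted(job_kwargs))
```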
class azure.mgmt.datalake.analytics.job.models.JobInformationPaged(*args, **kwargs)[source]

Bases: msrest.paging.Paged

A paging container for iterating over a list of JobInformation objects.
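A `Paged` container behaves like a plain iterator: it yields items from the current page and transparently fetches the next page until the service reports no more results. A minimal sketch of that behavior with canned pages (the real `msrest.paging.Paged` follows an HTTP "next link" instead):

```python
# Minimal sketch of the Paged iteration pattern: yield items from the
# current page, then pull the next page until the page source runs out.
class FakePaged:
    def __init__(self, pages):
        self._pages = iter(pages)     # stand-in for "fetch next page"
        self._current = iter([])      # items of the page being consumed

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            try:
                return next(self._current)
            except StopIteration:
                # Advance to the next page; when _pages is exhausted the
                # StopIteration propagates and ends the whole iteration.
                self._current = iter(next(self._pages))

pages = [["job-1", "job-2"], ["job-3"]]
print(list(FakePaged(pages)))  # ['job-1', 'job-2', 'job-3']
```

The consumer just writes `for job in paged: ...` and never sees page boundaries, which is exactly how `JobInformationPaged` is meant to be used.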

class azure.mgmt.datalake.analytics.job.models.JobResourceType[source]

Bases: enum.Enum

An enumeration.

job_manager_resource = 'JobManagerResource'
job_manager_resource_in_user_folder = 'JobManagerResourceInUserFolder'
statistics_resource = 'StatisticsResource'
statistics_resource_in_user_folder = 'StatisticsResourceInUserFolder'
vertex_resource = 'VertexResource'
vertex_resource_in_user_folder = 'VertexResourceInUserFolder'
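Enum parameters in these models follow the usual msrest pattern: an API accepts either the enum member or its string value, and `Enum(value)` maps a wire string back to a member. A sketch using a local two-member mirror of `JobResourceType` (abbreviated; the real enum has the six members listed above):

```python
from enum import Enum

# Local, abbreviated mirror of JobResourceType to show the pattern.
class JobResourceType(Enum):
    vertex_resource = 'VertexResource'
    statistics_resource = 'StatisticsResource'

member = JobResourceType('StatisticsResource')   # wire string -> member
print(member.name, member.value)                 # statistics_resource StatisticsResource
```

So `JobResource(type='StatisticsResource')` and `JobResource(type=JobResourceType.statistics_resource)` describe the same resource type on the wire.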
class azure.mgmt.datalake.analytics.job.models.SeverityTypes[source]

Bases: enum.Enum

An enumeration.

deprecated = 'Deprecated'
error = 'Error'
info = 'Info'
severe_warning = 'SevereWarning'
user_warning = 'UserWarning'
warning = 'Warning'
class azure.mgmt.datalake.analytics.job.models.CompileMode[source]

Bases: enum.Enum

An enumeration.

full = 'Full'
semantic = 'Semantic'
single_box = 'SingleBox'
class azure.mgmt.datalake.analytics.job.models.JobType[source]

Bases: enum.Enum

An enumeration.

hive = 'Hive'
usql = 'USql'
class azure.mgmt.datalake.analytics.job.models.JobState[source]

Bases: enum.Enum

An enumeration.

accepted = 'Accepted'
compiling = 'Compiling'
ended = 'Ended'
new = 'New'
paused = 'Paused'
queued = 'Queued'
running = 'Running'
scheduling = 'Scheduling'
starting = 'Starting'
waiting_for_capacity = 'WaitingForCapacity'
class azure.mgmt.datalake.analytics.job.models.JobResult[source]

Bases: enum.Enum

An enumeration.

cancelled = 'Cancelled'
failed = 'Failed'
none = 'None'
succeeded = 'Succeeded'