Databricks Naming Compliance Review Prompt

Use this prompt as the default naming standard and review procedure before inspecting the target repository. No source PDF is required. If the user explicitly supplies a newer naming standard, treat it as an override or addendum and call out any differences from the embedded rules below.

Inputs

  • repo_path: target Databricks ETL repository
  • review_ref: branch, tag, or commit to review; default to main
  • reviewer_name: user name for the metadata block
  • previous_artifact_hint: optional path or title for the prior review
  • output_preference: canvas-like artifact when available, or Markdown report

Metadata and Previous Run

At the top of the output, include:

Last updated by: <reviewer_name>
Run date: <current local date/time with timezone>
Reviewed ref: <review_ref>
Reviewed head commit: <full sha>
Reviewed head commit date: <commit date/time with timezone>
Delay since last run: <duration or "First recorded run">
Previous run source: <path/title or "none found">

Find the previous run before writing the new output:

  1. Search for prior naming-compliance artifacts in the requested output location, repo docs, canvas directory, or user-provided hint.
  2. Prefer artifacts with a matching title and repo/ref metadata.
  3. Parse the Run date field if present; otherwise fall back to a Last updated field.
  4. Calculate the elapsed time between the prior run date and the current run date in human-readable terms: N days, N hours, or N days N hours.
  5. If the prior artifact lacks a parseable date, report Previous run found, date not parseable.
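
The delay calculation in steps 3 to 5 can be sketched as follows. This is a minimal illustration, not a required implementation: the function names `describe_delay` and `_unit` are hypothetical, and it assumes the prior artifact stores its run date as an ISO 8601 string with a timezone offset. A date without a timezone is treated as not parseable, which is also an assumption.

```python
from datetime import datetime, timezone


def _unit(value: int, name: str) -> str:
    """Format a count with a singular or plural unit, e.g. '1 day', '3 hours'."""
    return f"{value} {name}" if value == 1 else f"{value} {name}s"


def describe_delay(previous_run_text, current_run: datetime) -> str:
    """Return the 'Delay since last run' value for the metadata block.

    previous_run_text: run date string from the prior artifact, or None
    if no prior artifact was found (hypothetical calling convention).
    """
    if previous_run_text is None:
        return "First recorded run"
    try:
        # Assumption: the prior run date is ISO 8601, e.g. 2024-05-01T09:00:00+00:00.
        previous_run = datetime.fromisoformat(previous_run_text)
    except ValueError:
        return "Previous run found, date not parseable"
    if previous_run.tzinfo is None or current_run.tzinfo is None:
        # Without a timezone the elapsed time is ambiguous.
        return "Previous run found, date not parseable"

    total_seconds = int((current_run - previous_run).total_seconds())
    if total_seconds < 0:
        return "Previous run found, date not parseable"

    days, remainder = divmod(total_seconds, 86_400)
    hours = remainder // 3_600
    if days and hours:
        return f"{_unit(days, 'day')} {_unit(hours, 'hour')}"
    if days:
        return _unit(days, "day")
    return _unit(hours, "hour")
```

For example, a prior run at 2024-05-01 09:00 UTC and a current run at 2024-05-03 12:00 UTC would be reported as 2 days 3 hours.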

Output Capability Question

If the user has not already specified an output mode, ask:

Should I create a canvas-like visual artifact if this IDE or agent supports one,
and otherwise generate a regular Markdown (.md) report?

Respect that answer for the run. Different IDEs and agents have different artifact capabilities, so never assume a canvas API exists. If a canvas-like artifact is unavailable, generate a regular .md report with the same metadata, score, evidence, and findings.

Embedded Naming Standard

Use these rules unless the user explicitly provides an updated standard:

  • Databricks folders use Title_Snake_Case: each word begins with a capital and words are separated by underscores, for example Milk_Supply_Forecast and Databricks_Jobs.
  • Code files, including notebooks, SQL queries, alerts, and job YAML, use pure lower snake_case: all words lower case and separated by underscores.
  • Code file names should be descriptive and loosely follow {actiontype}_{object}_{target_type}_{extrainfo}.{type}.
  • Bitbucket branches use {type}/{initials}/{JIRA-TICKET}-{PascalCaseDescription}: type and initials are lower case, JIRA-TICKET is upper case with a hyphenated ticket number, and the description is PascalCase.
  • Catalog objects, users/groups, and compute are delegated to other standards. Do not invent those rules; report assessability gaps. Still report observable drift such as mixed-case object names or hard-coded compute identifiers.

Inspection Scope

Use read-only inspection unless the user asks for remediation.

Review at least:

  • tracked folder segments on review_ref
  • code files with Databricks-relevant extensions: .py, .sql, .yml, .yaml, .dbalert.json, .dbquery.ipynb
  • Databricks job YAML filenames
  • job name: values and task_key values
  • notebook_path entries from job YAML
  • catalog/schema/table/view references in SQL and Spark code
  • hard-coded production catalogs, cluster IDs, warehouse IDs, and other compute references that may need separate-standard review
  • local and remote branch refs if branch naming is in scope
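
One read-only way to gather the first two items in this list is sketched below. It assumes `git` is on the PATH and that `repo_path` is a local clone. The helper names `list_tracked_files`, `folder_segments`, and `code_files` are hypothetical, and matching extensions case-insensitively is an assumption made so that files such as `Query.SQL` still reach the naming check.

```python
import subprocess
from pathlib import PurePosixPath

# Extensions named in the inspection scope above.
DATABRICKS_SUFFIXES = (
    ".py", ".sql", ".yml", ".yaml", ".dbalert.json", ".dbquery.ipynb",
)


def list_tracked_files(repo_path: str, review_ref: str = "main") -> list:
    """List files tracked on review_ref without checking anything out."""
    result = subprocess.run(
        ["git", "-C", repo_path, "ls-tree", "-r", "--name-only", "-z", review_ref],
        check=True,
        capture_output=True,
        text=True,
    )
    # -z gives NUL-separated, unquoted paths, so unusual names survive intact.
    return [path for path in result.stdout.split("\0") if path]


def folder_segments(paths) -> set:
    """Collect the unique folder names that appear in any tracked path."""
    segments = set()
    for path in paths:
        segments.update(PurePosixPath(path).parts[:-1])
    return segments


def code_files(paths) -> list:
    """Keep only files with a Databricks-relevant extension."""
    return [p for p in paths if p.lower().endswith(DATABRICKS_SUFFIXES)]
```

Reading from the ref with `git ls-tree` rather than from the working tree keeps the inspection read-only and independent of whatever branch is currently checked out.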

Recommended strict classifiers:

  • Folder compliant: ^[A-Z][A-Za-z0-9]*(?:_[A-Z][A-Za-z0-9]*)*$
  • Code file compliant: ^[a-z0-9]+(?:_[a-z0-9]+)*$ before the type suffix
  • Branch compliant: ^(bug|feature|discovery|hotfix|chore|docs|test|refactor)/[a-z]+/[A-Z][A-Z0-9]+-[0-9]+-[A-Z][A-Za-z0-9]*$

State any deviations from these classifiers, especially for acronyms, compound extensions, generated files, templates, archives, or exploratory work.
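
The three classifiers above can be applied roughly as in the sketch below. The regular expressions are copied from the list. The function names and the handling of compound extensions (stripping `.dbalert.json` and `.dbquery.ipynb` as a whole before testing the stem) are assumptions and should be reported as such if used.

```python
import re
from pathlib import PurePosixPath

FOLDER_RE = re.compile(r"^[A-Z][A-Za-z0-9]*(?:_[A-Z][A-Za-z0-9]*)*$")
CODE_STEM_RE = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*$")
BRANCH_RE = re.compile(
    r"^(bug|feature|discovery|hotfix|chore|docs|test|refactor)"
    r"/[a-z]+/[A-Z][A-Z0-9]+-[0-9]+-[A-Z][A-Za-z0-9]*$"
)

# Assumption: these compound extensions count as a single type suffix.
COMPOUND_SUFFIXES = (".dbalert.json", ".dbquery.ipynb")


def code_stem(filename: str) -> str:
    """Return the file name with its type suffix removed."""
    name = PurePosixPath(filename).name
    for suffix in COMPOUND_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    # Strip only the last extension so extra dots in the stem are flagged.
    return name.rsplit(".", 1)[0] if "." in name else name


def is_folder_compliant(segment: str) -> bool:
    return FOLDER_RE.fullmatch(segment) is not None


def is_code_file_compliant(filename: str) -> bool:
    return CODE_STEM_RE.fullmatch(code_stem(filename)) is not None


def is_branch_compliant(branch: str) -> bool:
    return BRANCH_RE.fullmatch(branch) is not None
```

Note that the folder pattern only checks that each underscore-separated word starts with a capital letter, so a segment such as `MilkSupply` passes. If the team reads Title_Snake_Case more strictly, that is a deviation worth stating in the report.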

Evidence to Collect

Return counts and percentages for:

  • total tracked files reviewed
  • unique folder segments reviewed
  • compliant and non-compliant folder segments
  • code files reviewed
  • compliant and non-compliant code files
  • Databricks job YAML files reviewed
  • compliant and non-compliant job YAML filenames
  • non-compliant job names and task keys
  • mixed-case or otherwise suspect catalog object references
  • hard-coded compute references
  • branch refs reviewed and matching the branch convention

Weighted Roll-Up Score

Calculate a weighted score out of 100 from the rule-coverage percentages:

score =
  (databricks_object_reference_compliance_pct * 0.50)
  + (code_file_compliance_pct * 0.25)
  + (folder_compliance_pct * 0.15)
  + (job_yaml_file_compliance_pct * 0.10)

Show the score to one decimal place and, when useful for readability, a rounded whole-number score such as 92 / 100.

Include a compact scoring breakdown with:

  • rule area
  • compliant count
  • compliance percentage
  • weight
  • score contribution

Briefly explain the weighting: Databricks object references are weighted at 50% because they are persistent runtime/data contracts and are usually harder to rename once consumed downstream. Code files are weighted at 25% because they are the largest day-to-day maintainability surface. Folders are weighted at 15% because they shape navigation and workspace consistency. Job YAML filenames get the remaining 10% because they affect deployment clarity but are typically a smaller, easier-to-remediate surface.
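
As a worked illustration of the formula and breakdown, the sketch below computes the score from compliant and total counts per rule area. The counts in the usage example are invented for illustration only. The helper names are hypothetical, and treating an area with zero reviewed items as 100% compliant is an assumption that should be stated in the report if it applies.

```python
# Weights from the formula above; they sum to 1.0.
WEIGHTS = {
    "databricks_object_reference": 0.50,
    "code_file": 0.25,
    "folder": 0.15,
    "job_yaml_file": 0.10,
}


def compliance_pct(compliant: int, total: int) -> float:
    """Percentage of compliant items in one rule area."""
    if total == 0:
        # Assumption: nothing to review counts as fully compliant.
        return 100.0
    return 100.0 * compliant / total


def score_breakdown(counts: dict) -> list:
    """Build one breakdown row per rule area from (compliant, total) pairs."""
    rows = []
    for area, weight in WEIGHTS.items():
        compliant, total = counts[area]
        pct = compliance_pct(compliant, total)
        rows.append({
            "rule_area": area,
            "compliant": compliant,
            "total": total,
            "pct": pct,
            "weight": weight,
            "contribution": pct * weight,
        })
    return rows


def weighted_score(counts: dict) -> float:
    """Weighted roll-up score out of 100, to one decimal place."""
    return round(sum(row["contribution"] for row in score_breakdown(counts)), 1)


# Illustrative counts only, not results from any real repository.
example_counts = {
    "databricks_object_reference": (45, 50),  # 90% -> contributes 45.0
    "code_file": (180, 200),                  # 90% -> contributes 22.5
    "folder": (36, 40),                       # 90% -> contributes 13.5
    "job_yaml_file": (10, 10),                # 100% -> contributes 10.0
}
print(f"{weighted_score(example_counts):.1f} / 100")
```

With these invented counts the contributions are 45.0, 22.5, 13.5, and 10.0, which sum to 91.0.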

For findings, include:

  • representative compliant examples
  • representative violations with paths
  • top hotspots by project/module
  • anti-pattern categories such as uppercase stems, hyphens, spaces, extra dots, lower-case folder segments, mixed-case object names, hard-coded production catalogs, and hard-coded compute IDs
  • ambiguity or standards gaps

Report Shape

Use this structure:

  1. Metadata block
  2. Executive summary
  3. Standards interpreted
  4. Weighted roll-up score and scoring explanation
  5. Compliance dashboard with counts and percentages
  6. Findings by severity
  7. Hotspots by project/module
  8. Representative compliant examples
  9. Representative violations and anti-patterns
  10. Progress or delay since last run
  11. Recommended remediation order
  12. Assumptions and standards gaps

If creating a visual artifact, keep the same information architecture and make the metadata block visible at the top. If no visual artifact capability is available, write a regular Markdown report instead.