Databricks Naming Compliance Review Prompt
Use this prompt as the default naming standard and review procedure before inspecting the target repository. No source PDF is required. If the user explicitly supplies a newer naming standard, treat it as an override or addendum and call out any differences from the embedded rules below.
Inputs
- `repo_path`: target Databricks ETL repository
- `review_ref`: branch, tag, or commit to review; default to `main`
- `reviewer_name`: user name for the metadata block
- `previous_artifact_hint`: optional path or title for the prior review
- `output_preference`: canvas-like artifact when available, or Markdown report
Metadata and Previous Run
At the top of the output, include:
Last updated by: <reviewer_name>
Run date: <current local date/time with timezone>
Reviewed ref: <review_ref>
Reviewed head commit: <full sha>
Reviewed head commit date: <commit date/time with timezone>
Delay since last run: <duration or "First recorded run">
Previous run source: <path/title or "none found">
Find the previous run before writing the new output:
- Search for prior naming-compliance artifacts in the requested output location, repo docs, canvas directory, or user-provided hint.
- Prefer artifacts with a matching title and repo/ref metadata.
- Parse `Run date` or `Last updated` if present.
- Calculate elapsed time between the prior run date and current run date in human terms: `N days`, `N hours`, or `N days N hours`.
- If the prior artifact lacks a parseable date, report `Previous run found, date not parseable`.
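The elapsed-time rule above can be sketched in Python as follows. This is a minimal illustration, not a required implementation: the function name `format_delay` is hypothetical, and it assumes both run dates have already been parsed into timezone-aware `datetime` values, with whole hours as the smallest reported unit.

```python
from datetime import datetime, timezone


def format_delay(previous_run, current_run):
    """Render the gap between two runs as 'N days', 'N hours', or 'N days N hours'.

    Assumption: previous_run is None when no prior artifact was found.
    """
    if previous_run is None:
        return "First recorded run"
    # Clamp at zero so a prior date in the future does not yield a negative delay.
    seconds = max(0.0, (current_run - previous_run).total_seconds())
    days, hours = divmod(int(seconds // 3600), 24)
    if days and hours:
        return f"{days} days {hours} hours"
    if days:
        return f"{days} days"
    return f"{hours} hours"
```

Partial hours are truncated rather than rounded, so a gap of 2 days 5.5 hours is reported as `2 days 5 hours`.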
Output Capability Question
If the user has not already specified an output mode, ask:
Should I create a canvas-like visual artifact if this IDE or agent supports one,
and otherwise generate a regular Markdown (.md) report?
Respect that answer for the run. Different IDEs and agents have different
artifact capabilities, so never assume a canvas API exists. If a canvas-like
artifact is unavailable, generate a regular .md report with the same metadata,
score, evidence, and findings.
Embedded Naming Standard
Use these rules unless the user explicitly provides an updated standard:
- Databricks folders use `Title_Snake_Case`: each word begins with a capital and words are separated by underscores, for example `Milk_Supply_Forecast` and `Databricks_Jobs`.
- Code files, including notebooks, SQL queries, alerts, and job YAML, use pure lower snake_case: all words lower case and separated by underscores.
- Code file names should be descriptive and loosely follow `{actiontype}_{object}_{target_type}_{extrainfo}.{type}`.
- Bitbucket branches use `{type}/{initials}/{JIRA-TICKET}-{PascalCaseDescription}`: `type` and `initials` are lower case, `JIRA-TICKET` is upper case with a hyphenated ticket number, and the description is PascalCase.
- Catalog objects, users/groups, and compute are delegated to other standards. Do not invent those rules; report assessability gaps. Still report observable drift such as mixed-case object names or hard-coded compute identifiers.
Inspection Scope
Use read-only inspection unless the user asks for remediation.
Review at least:
- tracked folder segments on `review_ref`
- code files with Databricks-relevant extensions: `.py`, `.sql`, `.yml`, `.yaml`, `.dbalert.json`, `.dbquery.ipynb`
- Databricks job YAML filenames
- job `name:` values and `task_key` values
- `notebook_path` entries from job YAML
- catalog/schema/table/view references in SQL and Spark code
- hard-coded production catalogs, cluster IDs, warehouse IDs, and other compute references that may need separate-standard review
- local and remote branch refs if branch naming is in scope
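One way to gather the first two items read-only is to list the tree at `review_ref` with `git ls-tree` and derive folder segments and code files from the paths. The sketch below assumes `git` is on the PATH and that the repository is checked out locally; the helper names are hypothetical, and the suffix tuple simply mirrors the extension list above.

```python
import subprocess
from pathlib import PurePosixPath

# Extensions from the inspection scope; compound suffixes are listed in full.
CODE_SUFFIXES = (".dbquery.ipynb", ".dbalert.json", ".yaml", ".yml", ".sql", ".py")


def list_tracked_files(repo_path, review_ref="main"):
    """Return the paths tracked at review_ref without touching the working tree."""
    result = subprocess.run(
        ["git", "-C", repo_path, "ls-tree", "-r", "--name-only", "-z", review_ref],
        check=True, capture_output=True, text=True,
    )
    # -z gives NUL-terminated, unquoted paths, which is safer for odd file names.
    return [path for path in result.stdout.split("\0") if path]


def folder_segments(paths):
    """Collect the unique folder names that appear anywhere in the tracked paths."""
    segments = set()
    for path in paths:
        segments.update(PurePosixPath(path).parts[:-1])
    return segments


def code_files(paths):
    """Keep only paths with a Databricks-relevant extension (matched case-insensitively)."""
    return [path for path in paths if path.lower().endswith(CODE_SUFFIXES)]
```

Job names, task keys, and catalog references need content inspection and are outside this sketch.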
Recommended strict classifiers:
- Folder compliant: `^[A-Z][A-Za-z0-9]*(?:_[A-Z][A-Za-z0-9]*)*$`
- Code file compliant: `^[a-z0-9]+(?:_[a-z0-9]+)*$` before the type suffix
- Branch compliant: `^(bug|feature|discovery|hotfix|chore|docs|test|refactor)/[a-z]+/[A-Z][A-Z0-9]+-[0-9]+-[A-Z][A-Za-z0-9]*$`
State any deviations from these classifiers, especially for acronyms, compound extensions, generated files, templates, archives, or exploratory work.
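As an illustration, the three classifiers can be applied with Python's `re` module roughly as shown below. The function names are hypothetical, and the handling of "before the type suffix" is an assumption: the sketch strips one known suffix from the inspection-scope extension list, trying compound suffixes first, and treats any file with an unrecognised suffix as not assessable (`False`).

```python
import re
from pathlib import PurePosixPath

FOLDER_RE = re.compile(r"^[A-Z][A-Za-z0-9]*(?:_[A-Z][A-Za-z0-9]*)*$")
CODE_STEM_RE = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*$")
BRANCH_RE = re.compile(
    r"^(bug|feature|discovery|hotfix|chore|docs|test|refactor)"
    r"/[a-z]+/[A-Z][A-Z0-9]+-[0-9]+-[A-Z][A-Za-z0-9]*$"
)
# Compound suffixes come first so ".dbalert.json" is not mistaken for a stem with a dot.
TYPE_SUFFIXES = (".dbquery.ipynb", ".dbalert.json", ".yaml", ".yml", ".sql", ".py")


def file_stem(filename):
    """Return the name before the type suffix, or None for an unrecognised suffix."""
    for suffix in TYPE_SUFFIXES:
        if filename.endswith(suffix):
            return filename[: -len(suffix)]
    return None


def is_folder_compliant(segment):
    return FOLDER_RE.fullmatch(segment) is not None


def is_code_file_compliant(path):
    stem = file_stem(PurePosixPath(path).name)
    return stem is not None and CODE_STEM_RE.fullmatch(stem) is not None


def is_branch_compliant(branch):
    return BRANCH_RE.fullmatch(branch) is not None
```

`fullmatch` is used instead of `match` so that a trailing newline in a name cannot slip past the `$` anchor. Note that the folder pattern checks only the first character of each word, so a segment such as `MilkSupply` passes; whether that counts as a deviation is one of the interpretation points to state in the report.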
Evidence to Collect
Return counts and percentages for:
- total tracked files reviewed
- unique folder segments reviewed
- compliant and non-compliant folder segments
- code files reviewed
- compliant and non-compliant code files
- Databricks job YAML files reviewed
- compliant and non-compliant job YAML filenames
- non-compliant job names and task keys
- mixed-case or otherwise suspect catalog object references
- hard-coded compute references
- branch refs reviewed and matching the branch convention
Weighted Roll-Up Score
Calculate a weighted score out of 100 from the rule-coverage percentages:
score =
(databricks_object_reference_compliance_pct * 0.50)
+ (code_file_compliance_pct * 0.25)
+ (folder_compliance_pct * 0.15)
+ (job_yaml_file_compliance_pct * 0.10)
Show the score to one decimal place and, when useful for readability, a rounded
whole-number score such as 92 / 100.
Include a compact scoring breakdown with:
- rule area
- compliant count
- compliance percentage
- weight
- score contribution
Briefly explain the weighting: Databricks object references are weighted at 50% because they are persistent runtime/data contracts and are usually harder to rename once consumed downstream. Code files are weighted at 25% because they are the largest day-to-day maintainability surface. Folders are weighted at 15% because they shape navigation and workspace consistency. Job YAML filenames get the remaining 10% because they affect deployment clarity but are typically a smaller, easier-to-remediate surface.
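The roll-up and its breakdown can be computed as in the following sketch. The weights come from the formula above; the dictionary keys and function names are hypothetical, and the sketch assumes all four percentages are available (the standard does not say how to score an area with nothing to review, so report that case as a gap rather than guessing).

```python
# Weights from the roll-up formula; they sum to 1.0, so the score is out of 100.
WEIGHTS = {
    "databricks_object_reference": 0.50,
    "code_file": 0.25,
    "folder": 0.15,
    "job_yaml_file": 0.10,
}


def score_breakdown(compliance_pcts):
    """Return (rule area, compliance pct, weight, contribution) rows in weight order."""
    return [
        (area, compliance_pcts[area], weight, compliance_pcts[area] * weight)
        for area, weight in WEIGHTS.items()
    ]


def weighted_score(compliance_pcts):
    """Sum the weighted contributions into a score out of 100."""
    return sum(row[3] for row in score_breakdown(compliance_pcts))


def format_score(score):
    """Show one decimal place plus the rounded whole-number form."""
    return f"{score:.1f} ({round(score)} / 100)"
```

With made-up inputs of 90%, 80%, 100%, and 70% for the four areas in the order above, the contributions are 45, 20, 15, and 7, for a score of 87.0.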
For findings, include:
- representative compliant examples
- representative violations with paths
- top hotspots by project/module
- anti-pattern categories such as uppercase stems, hyphens, spaces, extra dots, lower-case folder segments, mixed-case object names, hard-coded production catalogs, and hard-coded compute IDs
- ambiguity or standards gaps
Report Shape
Use this structure:
- Metadata block
- Executive summary
- Standards interpreted
- Weighted roll-up score and scoring explanation
- Compliance dashboard with counts and percentages
- Findings by severity
- Hotspots by project/module
- Representative compliant examples
- Representative violations and anti-patterns
- Progress or delay since last run
- Recommended remediation order
- Assumptions and standards gaps
If creating a visual artifact, keep the same information architecture and make the metadata block visible at the top. If no visual artifact capability is available, write a regular Markdown report instead.