
How AI Generates Unit Tests for PCF Controls: Patterns and Calibration

What AI-assisted test generation gets right, what it stumbles on, and how to calibrate the workflow for PCF controls. Notes from building Field Audit History.

Victoria Pechenizka

Most PCF controls ship with no tests. Open the public catalog of community PCFs on GitHub. Almost every repo has a package.json, a pcfproj, a manifest, and zero test files. The few that have tests usually have a render-without-crashing check and nothing else.

That is not a moral failing. PCF development already fights you on five fronts: the manifest XML, the Dataverse sandbox, packaging, solution import, and the runtime context shape. By the time the control works on a form, writing tests feels like paperwork on a feature that already shipped.

I felt the same way about Field Audit History. It worked. Three major versions, an MIT repo, real users. Why test it?

Because “it works on my form” and “it handles every edge case” are different statements. The repo today has 18 test files in pcf/FieldAuditHistory/__tests__/ and the README claims 90+ test cases. Most of that scaffolding came from AI-assisted generation, with manual review and rewrites on top. This article is the calibration notes from that process: where AI carries its weight, where it does not, and how to set the workflow up so the output is actually useful.

The Setup That Works

The setup matters more than the prompt. Feed an AI a single component file with no surrounding context and it will hallucinate types, mocks, and runtime APIs. Feed it the right context and the output tightens fast.

What “right context” means for a PCF control:

  1. The component source itself. Every .tsx file the test will exercise.
  2. The type definitions. Including any IConfig, IAuditEntry, or props interfaces. Without these the AI invents shapes that look plausible and even compile, but do not match the runtime.
  3. A representative data sample. A real fragment of what the Dataverse audit entity returns, or whatever your control consumes. Two or three records is enough.
  4. A working test harness. One existing passing test, even if trivial, anchors the output to your project’s actual mock patterns and import paths.
  5. The testing libraries you have already chosen. Jest, React Testing Library, jsdom, and any custom helpers you use for the PCF context object.

A prompt like “write Jest tests for this PCF using @testing-library/react, cover null handling, error states, and async loading, follow the patterns in the existing test file I attached” produces meaningfully better output than “write some tests.”

What AI Tends to Get Right

After using this loop on the FieldAuditHistory test suite and on a couple of unrelated PCFs, the patterns repeat. AI generation is reliably good at:

Rendering tests. “Component renders without crashing” with a variety of valid prop combinations. AI will systematically vary one prop at a time and produce coverage of the rendering matrix faster than I would write it by hand.
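
A minimal sketch of the shape, with AuditGrid as a placeholder for whatever your control renders:

import * as React from 'react';
import { render } from '@testing-library/react';
import { AuditGrid } from '../AuditGrid'; // placeholder: swap in your component

it.each([[true], [false]])('renders without crashing with compact=%s', (compact) => {
  render(<AuditGrid entries={[]} compact={compact} />);
});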

Empty and loading states. Asking the AI to enumerate states (loading, empty, error, data, denied) and produce one render test per state is one of the higher-leverage prompts. Easy to verify, easy to maintain.
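
A sketch of what that prompt produces; HistoryPanel, its status prop, and the on-screen copy are all placeholders to match to your component:

import * as React from 'react';
import { render, screen } from '@testing-library/react';
import { HistoryPanel } from '../HistoryPanel'; // placeholder component

it.each<[string, RegExp]>([
  ['loading', /loading/i],
  ['empty', /no audit history/i],
  ['error', /something went wrong/i],
  ['denied', /do not have access/i],
])('renders the %s state', (status, expectedText) => {
  render(<HistoryPanel status={status} entries={[]} />);
  expect(screen.getByText(expectedText)).toBeVisible();
});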

Basic prop variants. Boolean toggles, enum-style props, optional vs. required fields. AI is good at the exhaustive grid here.

Pure utility functions. Anything that takes input and returns output with no side effects. Date formatters, value parsers, CSV escapers, sort comparators. The contract is small, the inputs are enumerable, and the AI rarely gets confused.
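
The shape, with formatChangeDate standing in for any pure helper (the expected values depend on your helper's contract):

import { formatChangeDate } from '../utils/format'; // placeholder helper

describe('formatChangeDate', () => {
  it('formats a valid ISO timestamp', () => {
    expect(formatChangeDate('2024-03-01T12:00:00Z')).toBe('Mar 1, 2024');
  });

  it('falls back to a placeholder for null', () => {
    expect(formatChangeDate(null)).toBe('—');
  });
});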

Snapshot-style assertions on simple presentational components. Helpful as a regression net even though they break easily on intentional change.

What AI Tends to Stumble On

The categories where output needs the most rework, in order of how often I see the issues:

Async cleanup and cancellation. Tests that involve AbortController, MutationObserver disconnects, or useEffect cleanup callbacks frequently come back with the right shape but wrong assertions. The AI knows the words. It does not always know the order of events. Tests pass on the green path and silently miss the cancellation path they were supposed to verify.
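
The assertion worth adding by hand is the one on the cancellation path itself. A sketch, assuming a hypothetical useAuditEntries hook that passes an AbortSignal to its fetch and aborts in its useEffect cleanup:

import { renderHook } from '@testing-library/react';
import { useAuditEntries } from '../hooks/useAuditEntries'; // placeholder hook

it('aborts the in-flight request on unmount', () => {
  const abortSpy = jest.spyOn(AbortController.prototype, 'abort');
  const { unmount } = renderHook(() => useAuditEntries('emailaddress1'));

  unmount(); // the effect cleanup should fire the abort

  expect(abortSpy).toHaveBeenCalledTimes(1);
  abortSpy.mockRestore();
});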

The PCF runtime context. The ComponentFramework.Context object is a beast. AI guesses at the shape of webAPI, parameters, mode, and utils. Sometimes it is close. Sometimes the mock has a method signature that does not exist. Sometimes a test asserts on a return shape that differs from the real RetrieveMultipleResponse.

Real Dataverse SDK calls. When a test needs to mock retrieveMultipleRecords returning a specific shape with entities, nextLink, and @odata.context, the generated mock is often missing fields the production code reads. The test passes because the assertion is shallow, but the mock would not survive a code change that exercises a missing field.
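
A mock that survives more than shallow assertions carries every field the production code might read, not just the happy-path array (the record fields here are illustrative):

const retrieveMultipleRecords = jest.fn().mockResolvedValue({
  entities: [
    { auditid: 'a1', action: 2, createdon: '2024-03-01T12:00:00Z' },
    { auditid: 'a2', action: 1, createdon: '2024-02-28T09:30:00Z' },
  ],
  nextLink: undefined, // set to a URL string when the paging branch is under test
});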

Privilege and permission edge cases. AI will write a “user has permission” test and a “user does not have permission” test. It rarely writes the “permission check API itself failed” test. That third state is real (network timeout, Dataverse 5xx, malformed response) and it has to be added by hand.

MutationObserver and DOM portals. PCFs that inject UI into the form host DOM through MutationObserver or IntersectionObserver need tests that simulate the observer firing. AI writes the observer setup but rarely writes the cleanup verification, which is where the actual leaks live.
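
A sketch of the missing half, assuming a standard control class that creates its observer in init and tears it down in destroy (the class name is a placeholder; createMockContext is the factory from the Getting Started section below):

import { FieldObserverControl } from '../FieldObserverControl'; // placeholder control
import { createMockContext } from './helpers';

it('disconnects its MutationObserver on destroy', () => {
  const disconnectSpy = jest.spyOn(MutationObserver.prototype, 'disconnect');
  const control = new FieldObserverControl();

  control.init(createMockContext() as any, jest.fn(), {}, document.createElement('div'));
  control.destroy();

  expect(disconnectSpy).toHaveBeenCalled();
  disconnectSpy.mockRestore();
});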

React internals leaks. AI sometimes produces tests that assert on useState call counts or specific useEffect invocation order. These tests break on any harmless refactor and assert nothing about user-visible behavior. They need to be replaced with behavioral assertions.

Reviewing the Output

The review checklist that holds up across PCFs:

  1. Run before you read. Let the generated suite run against your real code first. Failures are more interesting than passes. A failing test is either a bug in your code or a mistake in the test, and either way you learn something. Reading 90 passing tests in order is a way to fall asleep, not a way to find issues.

  2. Delete duplicates aggressively. AI generates near-duplicates. Three tests that verify rendering with trivially different props can collapse to one. If two tests assert the same behavior, the one with the better edge case stays. The other goes.

  3. Verify every mock against the real type. Open the ComponentFramework.Context definition or your custom types side by side with the generated mock. Every property the production code touches must exist on the mock. A mock that satisfies the compiler but lacks a runtime field is a test waiting to lie to you. A typed factory that makes the compiler do part of this work is sketched after this list.

  4. Replace implementation tests with behavior tests. If a test checks how many times a hook ran, rewrite it to assert on what the user sees. The DOM the user reads. The text the screen renders. The button states they can click. That is the contract worth defending.

  5. Add the cases AI missed. After the review, look at coverage gaps. The remaining holes are usually in the categories above: async cleanup, runtime context mismatches, third-state permission paths, observer disconnects. Write those by hand. They are where domain knowledge earns its keep.

  6. Run coverage, look at the gaps, do not chase 100%. AI-generated suites typically land at 70-80% line coverage on the first useful pass. The last 20% is rarely worth chasing exhaustively. Focus on uncovered branches in error handling and lifecycle, not on uncovered logging statements. (The Getting Started section shows the config keys for this.)
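
On point 3, you can make the compiler do part of the mock-verification work by giving the factory a typed return. A sketch, assuming the ambient ComponentFramework types from the PCF tooling are visible to your test tsconfig and IInputs is the type generated from your manifest:

import { IInputs } from '../generated/ManifestTypes';

// DeepPartial lets the mock cover only what the tests actually touch.
type DeepPartial<T> = { [K in keyof T]?: DeepPartial<T[K]> };
type MockContext = DeepPartial<ComponentFramework.Context<IInputs>>;

export const createTypedMockContext = (overrides: MockContext = {}): MockContext => ({
  // a property name here that does not exist on the real Context type is a compile error
  webAPI: {
    retrieveMultipleRecords: jest.fn(),
  },
  ...overrides,
});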

A Few Patterns Worth Stealing

These are excerpts from the FieldAuditHistory suite. The full files are visible on GitHub at vp365ai/field-audit-history under pcf/FieldAuditHistory/__tests__/.

Behavioral assertion on a callout component. No internals, just what the user reads and clicks.

it('renders the field name and the latest audit entry', () => {
  render(
    <QuickPeekCallout
      fieldLogicalName="emailaddress1"
      entries={sampleEntries}
      onOpenDeepDive={jest.fn()}
    />
  );
  expect(screen.getByText('Email')).toBeVisible();
  expect(screen.getByText(/Changed by/)).toBeVisible();
});

Cancellation behavior on a stale request. This is the kind of test AI writes the scaffolding for but needs human review to actually assert the right outcome.

it('drops a stale response when a newer request started', async () => {
  const { result } = renderHook(() => useQuickPeek());

  act(() => result.current.open('emailaddress1'));
  act(() => result.current.open('telephone1'));

  await flushPromises();

  expect(result.current.fieldLogicalName).toBe('telephone1');
});

Permission states as a triple, not a pair. granted, denied, error are three different user experiences. Test all three.

describe('privilege fallback', () => {
  it.each([
    ['granted', 'shows audit data'],
    ['denied',  'shows the access message'],
    ['error',   'shows retry'],
  ])('when privilege check is %s, %s', async (state) => {
    mockPrivilegeCheck(state);
    // ...
  });
});

The pattern across all three: the test name reads like a sentence. If you cannot read the test name and predict the assertion, the test name is wrong.

A Note on the ROI Conversation

There is a real productivity case to make for AI-assisted test generation. There is also a temptation to dress that case up with manual-baseline numbers nobody actually measured. I am going to skip the dressing.

What is true and easy to verify:

  • A useful first-pass test suite for a non-trivial component arrives in minutes, not hours.
  • The review pass is real work and should be budgeted as real work.
  • The bugs that surface during review are usually in code paths the human author did not write tests for, because if those paths had felt important they would have been tested already.
  • Coverage tooling, run after the review pass, is the honest signal of where the suite still has holes.

What is not honest:

  • Quoting a “manual baseline” of 20-25 hours when the manual baseline never ran.
  • Counting bugs found without linking to the commits or issues that fixed them.
  • Treating the AI as an autonomous test author. It is a fast first-draft author. The reviewer is still you.

Getting Started

Minimum setup for any React-based PCF:

npm install --save-dev jest ts-jest @testing-library/react \
  @testing-library/jest-dom @types/jest jest-environment-jsdom \
  identity-obj-proxy

A jest.config.ts that handles the common pain points:

export default {
  preset: 'ts-jest',
  testEnvironment: 'jsdom',
  moduleNameMapper: {
    // stub style imports so component files load under Jest
    '\\.(css|less|scss)$': 'identity-obj-proxy',
  },
  // extend expect with jest-dom matchers before each test file runs
  setupFilesAfterEnv: ['@testing-library/jest-dom'],
};
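
To wire in the coverage discipline from the review checklist, a couple of extra keys in the same config do it (the paths and the threshold here are illustrative):

export default {
  // ...the config above, plus:
  collectCoverageFrom: ['**/*.{ts,tsx}', '!**/generated/**', '!**/__tests__/**'],
  coverageThreshold: {
    global: { branches: 70 }, // gate on branches, not lines
  },
};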

A handwritten mock factory for the PCF context. This is the file AI cannot help you with, because the runtime shape is not in its training data in any reliable form. Write it once, import it everywhere.

// __tests__/helpers.ts
// Minimal hand-rolled stand-in for ComponentFramework.Context.
// Extend it as your control touches more of the runtime surface.
export const createMockContext = (overrides = {}) => ({
  parameters: {},
  webAPI: {
    retrieveMultipleRecords: jest.fn(),
    retrieveRecord: jest.fn(),
    updateRecord: jest.fn(),
  },
  utils: {
    getEntityMetadata: jest.fn(),
  },
  mode: {
    isControlDisabled: false,
    isVisible: true,
  },
  ...overrides, // shallow merge: an override replaces the whole top-level key
});
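
Usage in a test, overriding only what the test cares about:

const context = createMockContext({
  webAPI: {
    retrieveMultipleRecords: jest.fn().mockResolvedValue({ entities: [], nextLink: undefined }),
  },
});

Note the shallow merge: overriding webAPI replaces the whole object, which is usually what a test wants. Switch to a deep merge if your tests start overriding one method at a time.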

Then feed your component file, your types, your data sample, and one passing test into the AI. Ask for tests in the categories above. Run them. Review every assertion. Fix what is wrong on either side of the assertion.

The Honest Pitch

AI-assisted test generation is not a magic compiler from “no tests” to “good tests.” It is a way to remove the activation energy of the blank file. The blank file is the reason most PCFs ship with no tests. The blank file is also where the first hour of any test-writing session goes.

If you can replace that hour with ten minutes of prompting and three hours of focused review, you end the day with a real suite instead of an empty __tests__/ folder. That is the trade. It is worth taking. It is not worth dressing up.
