[Security Solution] Fix incorrect changes highlighting in diff view (#205138)

**Resolves: https://github.com/elastic/kibana/issues/202016**

## Summary

This PR resolves an issue where the diff view incorrectly marked certain
characters as changed (using bold font) in some cases.

## Root Cause
The issue arises from a bug in the `diff` library (v5). The library is
used to compute two-way diffs between strings (old field value and new
field value), producing an array of change objects that is then used for
rendering.

Conditions for the bug:
- `diff` v5 library is in use (fixed in v6 and above) and
`DiffMethod.WORDS` is passed as a parameter to it.
- The old field value contains a line with an addition separated by a
space (see example below).
- The next line contains some changes (additions, deletions, or
updates).


For example, for these input strings:
```
foo bar
spring
```
```
foo
sprint
```

| You would expect to see this diff | But you actually see this |
|----------|----------|
| <img width="119" alt="expected"
src="https://github.com/user-attachments/assets/c41b3dec-e578-4a12-8eb8-91fbb60d7247"
/> | <img width="118" alt="actual"
src="https://github.com/user-attachments/assets/f2a33fee-5de2-4291-876a-e7575ea07079"
/> |

**A more real-life example**
<img width="1661" alt="more_real_life"
src="https://github.com/user-attachments/assets/91ebfe93-81ad-45c8-8f9b-e173c2cf940b"
/>


## Solution
Switching to `DiffMethod.WORDS_WITH_SPACE` avoids this issue. 
Screenshot showing the difference between `DiffMethod.WORDS` and
`DiffMethod.WORDS_WITH_SPACE`:
<img width="675" alt="words_vs_words_with_space"
src="https://github.com/user-attachments/assets/3c91e1d2-63fc-4fcd-a762-a905878bfc3a"
/>

## Other changes
- Removed `DiffMethod.TRIMMED_LINES` since it's now
[deprecated](https://github.com/kpdecker/jsdiff/pull/486) in the `diff`
library and we are not using it anyways.
- Stopped using the "zip" option since I believe it produces a less
readable diff, especially for cases when there's a different number of
lines in the original value vs updated value.

<details>
<summary><strong>Screenshots: with and without "zip" (click to
expand)</strong></summary>
<strong>With the "zip" option (how it was before)</strong>
<img width="1918" alt="zip"
src="https://github.com/user-attachments/assets/272ed849-47d6-4fef-8acc-ab1b22c9f42e"
/>

<strong>No "zip" (this branch)</strong>
<img width="1919" alt="no_zip"
src="https://github.com/user-attachments/assets/417303bf-9570-4ee1-98c5-8a78f59c7956"
/>
</details>

## Testing

I thoroughly tested with `DiffMethod.WORDS_WITH_SPACE` across various
inputs and scenarios, including:
- Single-line and multi-line strings.
- Numbers, arrays, and objects.
- Additions, deletions, and updates at different positions (start,
middle, and end) within and across lines.

I also validated diffs against real prebuilt rules by installing an
older Fleet package version and observed no issues.

You can test by trying different input strings and settings in
Storybook.
**Run Storybook**: `yarn storybook security_solution`.


https://github.com/user-attachments/assets/0440b73c-a4d7-40cf-9cee-e632146d292e

You can notice that `ComparisonSide` stories are broken, but that's
unrelated to these changes and needs to be handled separately.

## Compatibility with future upgrades

There's an open [PR](https://github.com/elastic/kibana/pull/202622) that
will upgrade the `diff` library from v5 to v7. I verified the behavior
of `DiffMethod.WORDS_WITH_SPACE` on v7 and found no differences compared
to v5, so it should be safe to upgrade to v7 without any changes on our
end.

Work started on 23-Dec-2024.
This commit is contained in:
Nikita Indik 2024-12-30 13:38:42 +01:00 committed by GitHub
parent bb877cff7e
commit 140c2e0ecf
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
3 changed files with 77 additions and 9 deletions

View file

@ -0,0 +1,60 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import React from 'react';
import type { Story } from '@storybook/react';
import type { DiffViewProps } from './diff_view';
import { DiffView } from './diff_view';
import { DiffMethod } from './mark_edits';
export default {
component: DiffView,
title: 'Rule Management/Prebuilt Rules/Upgrade Flyout/ThreeWayDiff/DiffView',
argTypes: {
oldSource: {
control: 'text',
},
newSource: {
control: 'text',
},
diffMethod: {
control: 'select',
options: [
DiffMethod.WORDS_WITH_SPACE,
DiffMethod.WORDS,
DiffMethod.CHARS,
DiffMethod.LINES,
DiffMethod.SENTENCES,
],
defaultValue: DiffMethod.WORDS_WITH_SPACE,
},
zip: {
control: 'boolean',
defaultValue: false,
},
},
};
const Template: Story<DiffViewProps> = ({ oldSource, newSource, diffMethod, zip }) => {
return (
<DiffView
oldSource={oldSource}
newSource={newSource}
diffMethod={diffMethod}
zip={zip}
viewType="unified"
/>
);
};
export const Default = Template.bind({});
Default.args = {
oldSource:
'from logs-endpoint.alerts-*\n| where event.code in ("malicious_file", "memory_signature", "shellcode_thread") and rule.name is not null\n| keep host.id, rule.name, event.code\n| stats hosts = count_distinct(host.id) by rule.name, event.code\n| where hosts >= 3',
newSource:
'from logs-endpoint.alerts-*\n| where event.code in ("malicious_file", "memory_signature", "shellcode_thread")\n| stats hosts = count_distinct(host.id)\n| where hosts >= 3',
};

View file

@ -25,7 +25,6 @@ import type {
HunkTokens,
} from 'react-diff-view';
import unidiff from 'unidiff';
import type { Change } from 'diff';
import { useEuiTheme, COLOR_MODES_STANDARD } from '@elastic/eui';
import { Hunks } from './hunks';
import { markEdits, DiffMethod } from './mark_edits';
@ -89,7 +88,7 @@ const useTokens = (
try {
/*
Synchroniously apply all the enhancers to the hunks and return an array of tokens.
Synchronously apply all the enhancers to the hunks and return an array of tokens.
*/
return tokenize(hunks, options);
} catch (ex) {
@ -128,7 +127,7 @@ const convertChangesToUnifiedDiffString = (changes: Change[]): string => {
return unifiedDiff;
};
const convertToDiffFile = (oldSource: string, newSource: string) => {
const convertToDiffFile = (oldSource: string, newSource: string, zip?: boolean) => {
/*
"diffLines" call converts two strings of text into an array of Change objects.
*/
@ -156,7 +155,7 @@ const convertToDiffFile = (oldSource: string, newSource: string) => {
Hunks represent changed lines of code plus a few unchanged lines above and below for context.
*/
const [diffFile] = parseDiff(unifiedDiff, {
nearbySequences: 'zip',
nearbySequences: zip ? 'zip' : undefined,
});
return diffFile;
@ -255,18 +254,25 @@ const CustomStyles: FC<PropsWithChildren<unknown>> = ({ children }) => {
);
};
interface DiffViewProps extends Partial<DiffProps> {
export interface DiffViewProps extends Partial<DiffProps> {
oldSource: string;
newSource: string;
diffMethod?: DiffMethod;
viewType?: 'split' | 'unified';
/*
When "zip" is set to true, the change lines will be rendered in an interlaced style.
For an example, refer to:
https://github.com/otakustay/react-diff-view/blob/8a2dbdf97af0890aff6e563ed435e7da13c5e7b1/README.md#parse-diff-text
*/
zip?: boolean;
}
export const DiffView = ({
oldSource,
newSource,
diffMethod = DiffMethod.WORDS,
diffMethod = DiffMethod.WORDS_WITH_SPACE,
viewType = 'split',
zip = false,
}: DiffViewProps) => {
/*
"react-diff-view" components consume diffs not as a strings, but as something they call "hunks".
@ -277,7 +283,10 @@ export const DiffView = ({
/*
"diffFile" is essentially an object containing an array of hunks plus some metadata.
*/
const diffFile = useMemo(() => convertToDiffFile(oldSource, newSource), [oldSource, newSource]);
const diffFile = useMemo(
() => convertToDiffFile(oldSource, newSource, zip),
[oldSource, newSource, zip]
);
/*
Sections of diff without changes are hidden by default, because they are not present in the "hunks" array.

View file

@ -38,7 +38,6 @@ export enum DiffMethod {
WORDS = 'diffWords',
WORDS_WITH_SPACE = 'diffWordsWithSpace',
LINES = 'diffLines',
TRIMMED_LINES = 'diffTrimmedLines',
SENTENCES = 'diffSentences',
CSS = 'diffCss',
}
@ -262,7 +261,7 @@ function diffChangeBlock(
* The format of the strings is as follows:
*/
export function markEdits(hunks: HunkData[], diffMethod: DiffMethod): TokenizeEnhancer {
/*
/*
changeBlocks is an array that contains information about the lines that have changes (additions or deletions).
Unchanged lines are not included.
*/