Another metaphor will help.
Imagine your company owns either a shed, a house, a multi-family complex, or a high-rise. The shed owner probably doesn’t spend much time thinking about engineering permits or maintenance since they don’t need to worry about services like heating, running water, or electricity. And if the shed won’t be upgraded anytime soon, there’s no need to plan for future changes.
The house owner will need to devote significantly more effort to keeping things running; even so, they won’t need to concern themselves with more than their own and their family’s needs.
A multi-family complex requires more planning, more resource integration, and a more structured approach to handling issues as they arise, since different tenants may have very different needs.
And a high-rise tower will have a completely different set of engineering requirements, permitting requirements, safety requirements, and so on.
The high-rise will also be much more complicated to manage, particularly if things like fixtures have not been standardized throughout the building.
See a trend here? The more complicated the structure, the greater the need for better infrastructure. And earlier planning becomes even more important for bigger buildings, as early mistakes and omissions can become costly later on.
The same holds true for data engineering within a company. While the smallest companies may not immediately need tools and structures to support data access, companies with growth aspirations will need to consider this issue. And the longer they wait, the more expensive the effort will be, with more work required to fix past errors and omissions.
Data governance is the work of setting your company’s policies on how it collects data. It is tightly integrated with data strategy, which includes defining and implementing data governance rules.
A solid data governance policy will not only identify which sources to consolidate, but will also define who can access data, when data is pulled, and what to do if conflicts occur. Governance policy also addresses other issues, such as retention and integrating new data sources as they come online.
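To make those policy elements concrete, here is a minimal sketch of how they might be written down as data rather than left as tribal knowledge. Everything in it is an assumption made for illustration: the `SourcePolicy` class, its field names, the two example sources, and the "lower priority number wins" conflict rule are hypothetical and do not come from any particular governance tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import FrozenSet


@dataclass(frozen=True)
class SourcePolicy:
    """Hypothetical governance rules for one data source."""
    name: str
    allowed_roles: FrozenSet[str]  # who can access the data
    pull_schedule: str             # when data is pulled, e.g. "daily"
    retention_days: int            # how long records are kept
    priority: int                  # assumed rule: lower number wins a conflict


def can_access(policy: SourcePolicy, role: str) -> bool:
    """Return True if the given role is allowed to read this source."""
    return role in policy.allowed_roles


def resolve_conflict(a: SourcePolicy, b: SourcePolicy) -> SourcePolicy:
    """Pick which source to trust when two sources disagree on a value."""
    return a if a.priority <= b.priority else b


def is_expired(policy: SourcePolicy, collected_on: date, today: date) -> bool:
    """Return True if a record is older than the source's retention window."""
    return today - collected_on > timedelta(days=policy.retention_days)


# Two illustrative sources; the names and numbers are invented.
crm = SourcePolicy("crm", frozenset({"sales", "analyst"}), "hourly", 730, priority=1)
billing = SourcePolicy("billing", frozenset({"finance"}), "daily", 2555, priority=2)
```

With definitions like these, a question such as "can an analyst read billing data?" becomes a call to `can_access(billing, "analyst")`, and adding a new data source as it comes online means adding one more `SourcePolicy` entry rather than renegotiating the rules from scratch.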