Data Classification: Key to Unlocking AI Potential
As North Carolina prepares to enhance its data for artificial intelligence (AI) applications, the state’s chief privacy officer emphasizes that proper data identification is the crucial first step toward realizing AI’s full potential.
The Importance of Data Classification
Martha Wewer, who has been North Carolina’s chief privacy officer since May 2025, is leading a data classification project to help the state better identify its data. As Wewer notes, data classification means knowing exactly what data you have, where it lives, and how sensitive it is. That knowledge is the basis for effective privacy and security controls, which in turn reduce risks to data privacy.
Current State and Future Goals
Wewer jokes about her passion for data classification, a topic that grows in importance as the state moves toward greater AI integration. While North Carolina has protected its data effectively, proper classification will help the state determine which datasets are suitable for AI tools, which require aggregation or anonymization first, and which are of sufficient quality for reliable use.
The project is a priority for state leadership, as evidenced by North Carolina Governor Josh Stein’s executive order last fall, which established a statewide AI framework, an AI leadership council, and an AI accelerator at the Department of Information Technology.
Data Classification Process
Wewer states, “Our governor has said that he wants to build safe and trustworthy AI use within the state.” Her top priority is identifying and labeling sensitive data to prevent potential data leaks. Data classification begins with an inventory of the state’s data, in which each item is labeled public, internal, confidential, or highly sensitive. These labels let the state write specific rules governing data usage, such as which datasets are appropriate for AI applications.
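To make the inventory-then-label idea concrete, here is a minimal Python sketch. It is an illustration only, not the state's actual system: the four labels come from the process described above, but the `DatasetRecord` fields, the sample dataset names, and the rule for which labels may feed an AI tool are all assumptions invented for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four labels named in the article, ordered least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    HIGHLY_SENSITIVE = 4


@dataclass(frozen=True)
class DatasetRecord:
    """One inventory entry: what the data is, where it lives, how sensitive it is."""
    name: str
    location: str
    sensitivity: Sensitivity


# Assumption for illustration: the highest label allowed into an AI tool as-is.
MAX_LEVEL_FOR_AI = Sensitivity.INTERNAL


def ai_eligibility(record: DatasetRecord) -> str:
    """Apply a hypothetical usage rule based only on the record's label."""
    if record.sensitivity <= MAX_LEVEL_FOR_AI:
        return "allowed"
    if record.sensitivity == Sensitivity.CONFIDENTIAL:
        return "aggregate_or_anonymize_first"
    return "blocked"


# Hypothetical inventory entries; names and locations are made up.
inventory = [
    DatasetRecord("park_locations", "open-data-portal", Sensitivity.PUBLIC),
    DatasetRecord("staff_directory", "intranet", Sensitivity.INTERNAL),
    DatasetRecord("permit_applications", "agency-db", Sensitivity.CONFIDENTIAL),
    DatasetRecord("health_records", "secure-enclave", Sensitivity.HIGHLY_SENSITIVE),
]

decisions = {record.name: ai_eligibility(record) for record in inventory}

for name, decision in decisions.items():
    print(f"{name}: {decision}")
```

The point of the sketch is that the rule reads only the label, so a dataset that was never inventoried or labeled cannot be evaluated at all. Where the real thresholds sit is a policy decision the article does not specify.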
Impact of Data Classification on Privacy and Security
Data classification is often performed alongside a privacy threshold analysis, which helps identify personally identifiable information. Together, these steps determine the protective measures applied to each dataset, including access controls, encryption, retention periods, logging, and vendor requirements. Without proper labeling, sensitive information can be inadvertently exposed or misused in AI training.
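One way to picture how a label and a privacy threshold analysis result could drive those protective measures is a simple lookup, sketched below in Python. Everything specific here is hypothetical: the `ControlProfile` fields mirror the measures listed above, but the role names, retention periods, and the rule that personally identifiable information tightens the baseline are assumptions, not North Carolina's actual policy.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ControlProfile:
    """Protective measures attached to a dataset (fields mirror the article's list)."""
    allowed_roles: tuple
    encrypt_at_rest: bool
    retention_days: int
    access_logging: bool
    vendor_review_required: bool


# Hypothetical baseline per label; every value is a placeholder.
BASELINE = {
    "public": ControlProfile(("anyone",), False, 3650, False, False),
    "internal": ControlProfile(("employee",), True, 2555, False, False),
    "confidential": ControlProfile(("data_steward",), True, 1825, True, True),
    "highly_sensitive": ControlProfile(("data_steward",), True, 365, True, True),
}


def controls_for(label: str, contains_pii: bool) -> ControlProfile:
    """Look up the baseline for a label, tightening it if PII was found."""
    if label not in BASELINE:
        # Unlabeled or mislabeled data gets no profile rather than a default.
        raise ValueError(f"unknown classification label: {label!r}")
    profile = BASELINE[label]
    if contains_pii:
        # Assumed rule: PII always requires encryption, logging, and vendor review.
        profile = replace(
            profile,
            encrypt_at_rest=True,
            access_logging=True,
            vendor_review_required=True,
        )
    return profile


print(controls_for("internal", contains_pii=True))
```

Raising an error for an unknown label, rather than falling back to a default profile, reflects the concern above that unlabeled data is where inadvertent exposure happens.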
Growing Significance as AI Integration Expands
Wewer underscores that the stakes are rising as AI becomes more deeply integrated into government operations. Stakeholders are keenly interested in understanding how privacy and data protection align with AI technologies. Many people working within the state’s privacy culture are eager to use AI tools but remain concerned about data privacy, which she views as a positive sign of thoughtful consideration.
Collaboration for Governance Frameworks
The chief privacy officer collaborates with I-Sah Hsieh, North Carolina’s deputy secretary for artificial intelligence and policy, to address the intersections of privacy, AI, and cybersecurity governance. “We didn’t have a lot of guidelines when people started talking about privacy,” Wewer acknowledges, so custom frameworks had to be created outside existing regulatory structures like HIPAA. The governance applied to state-collected data must now encompass AI frameworks as well, which is why Wewer, Hsieh, and Chief Information Security Officer Bernice Russell-Bond work in close collaboration.
This comprehensive approach to data classification and governance is essential not only for safeguarding sensitive information but also for fostering responsible AI innovation within the state.