Race & Ethnicity Data: Developing a Common Language for Public Health Surveillance in Hawaii

Race and ethnicity are key variables used in the field of public health surveillance for monitoring and tracking health status and outcomes of populations. However, over the last decade the collection and use of these traditional variables has come under scrutiny. Central to the arguments are the manner in which race and ethnicity are conceptualized and the lack of standards in terms of how the data elements are defined. This paper describes some of the challenges to collecting and reporting race and ethnicity data in a state that has a large and growing multi-racial and multi-ethnic population base. A model is presented that describes a methodology for incrementally clustering very discrete ethnic sub-populations to ethnic subgroups and eventually to the Office of Management and Budget five race classifications. © 2003 Californian Journal of Health Promotion. All rights reserved.


Introduction - The Importance of Ethnicity & Race Data for Public Health
Race and ethnicity data have been collected for more than 200 years in the U.S. in order to provide denominators for social and demographic analysis. Prior to 1930 "race" was ascribed to national groups that were more often than not immigrants (e.g., Hebrews, Italians, Celts). Such groups along with the American Indians and Blacks were regarded as quite apart from the White founders. Racial distinctions were used to support "scientific" evidence that "morals, physical and intellectual capacity were inherited" (Oppenheimer, 2001).
Today ethnicity and race data continue to be collected to frame and describe public health issues. In one of the first reports on health disparities, Du Bois described differences between African-American and White populations in the early 1900s (Du Bois, 1906). Because disease, injury, access to prevention and treatment services, cost of care, quality of life, and other public health concerns persist, disproportionately affecting some ethnic and race populations over others, the Healthy People 2010 Initiative adopted as one of its three overarching goals the elimination of health disparities (US DHHS, 2000). Ethnicity and race are important determinants in health patterns, whether representing true biological and genetic differences or a set of factors that affect health and health status.
The last 15 years have seen a rise in the criticism, particularly in the U.S., of the collection and use of race and ethnicity data. U.S. experts in the natural and social sciences agree that the biological concept of race has no scientific basis (CDC & P, 1993; NCI, 2000). The Institute of Medicine argues that "in all instances race is a social and cultural construct" with its base in "perceived differences in biology, physical appearance, and behavior" (Haynes & Smedley, 1999, p. 38). In fact, data show there is substantially more genetic variation within "races" than between them (NCI, 2000).

Challenges Using Race and Ethnicity
There are two key issues regarding race and ethnicity. First is the manner in which the terms are conceptualized and implemented, and second is the range of definitions used to cluster population groups. The heart of the challenge is that race and ethnicity are not simply terms but are concepts. They are complex and personal concepts that have context and, as noted, that context is a "product of social and political history" (CDC & P, 1993). The shape of one's eyes or nose and the texture of one's hair are not accurate or legitimate bases for classifying people into meaningful racial subgroups. As a biological or genetic measure, race may suggest genetic homogeneity across population groups. In truth, these populations are heterogeneous. Asian, Black, White, and Native American are examples of genetic admixtures of geographic stock from all over the world. Vietnamese, Pakistani, and Filipino are all Asian; however, each is derived from a different geographic genetic stock. The same fact applies to persons classified White. The last two decades in particular have seen a heightened sensitivity to differences among minority ethnic groups. On the other hand, White is such a ubiquitous and familiar term that the diversity within the White race is not even regarded. White is inclusive of such diverse groups as Scottish, Greek, Spanish, Canadian, Iranian, and Moroccan. Clustering or aggregating diverse population groups to a single race category denies or ignores genetic variability.
Use of race and ethnicity categories also implies cultural homogeneity for such matters as health beliefs, dietary practices, and physical activity. If in fact culture and health are linked, then the presumption might be made that health risks and outcomes would be homogeneous within a category. Data indicate otherwise. For example, breast cancer incidence rates for Asian-American ethnic groups are not the same (Deapen, Liu, Perkins, Bernstein, & Ross, 2002). In the U.S. breast cancer risk among women of Japanese and Filipino ancestry is twice that of Chinese and Korean women. These four individual ethnic groups are all Asian, yet each has its own distinct culture.
Classifying by race and ethnicity suggests that racial or ethnic identities are static and mutually exclusive. Research shows that these data are fluid, as people can and do change how they self-report their race or ethnicity for political and economic reasons. Parents may report detail about their children's racial and ethnic background, particularly for multi-racial births. As these children age, and particularly those who leave home, they may simplify how they describe their racial and ethnic background (Walters, 2000).
These challenges have given rise to arguments that the collection and use of race and ethnicity should be discontinued. The proposals have suggested emphasizing socioeconomic data and other "life factors" including health insurance status, geography (e.g., place of residency and length of residency in U.S.), personal/family income status, religion, and personal health beliefs and practices (Fullilove, 1998; Krieger, 2000; NCI, 2000). While acknowledging the salience of such arguments, others contend that race and ethnicity information should be expanded. The increase in such collection would be done in concert with a broad awareness of the implications of using race and ethnicity (LaVeist, 1996; Thomas, 2001; Williams, Lavizzo-Mourey, & Warren, 1994).
Other recommendations include providing detailed guidelines to address issues such as multiple response sets, information on the collection method (self-report or observation), and study design (Jones, 2001; Kaufman & Cooper, 2001).
Despite the widespread practice of asking a person about his/her race and/or ethnicity, the charge to collect these data is not universal. Many European countries are prohibited constitutionally or legislatively from collecting these data, though actual practices do vary among the European nations. In the U.S. public health organizations and private entities voiced major concerns with a California ballot initiative (Proposition 54) which would have restricted state and local government offices from collecting information on a person's race, ethnicity, color, or national origin for certain purposes. The initiative was defeated.
Because the use of race and ethnicity is rooted in public health surveillance, the likelihood that the collection and use of these descriptors will be abandoned anytime soon is remote. As one epidemiologist expressed it: "This is what we are taught to do" (Jones, 2001). A critical reason for continuing to collect these data is that health disparities do exist between different racial and ethnic groups. The challenge is that differential patterns of disease, risk factors, health beliefs and practices, and access to services across racial and ethnic populations are poorly understood, and what is known is incomplete.

Federal Reporting Standards
The U.S. Census is an essential data source for information identifying trends and changes in economic, social, and health characteristics by race. Unfortunately, the method used to gather Census data has not been stable, with almost every decade bringing changes in how the data are collected. It was not until the mid-1970s that the federal government began a concerted effort to develop and implement a common language for reporting ethnicity and race. The impetus to develop a standard was the need for comparable data to monitor equality issues including the availability of and access to health care services, employment opportunities, education, and housing for population groups that experienced discrimination.
Since 1977, the Office of Management and Budget's (US OMB, 1997) Statistical Policy Directive 15 has provided the common language promoting uniformity and comparability for collecting and reporting of ethnicity and race data. The standard recognized four categories for race data (American Indian or Alaskan Native, Asian and Pacific Islander, Black, and White) and two categories of ethnicity data ("Hispanic origin" and "Not of Hispanic origin"). In 1997 the OMB announced revisions to Directive 15 to address the need for more refined data that would take into account demographic changes including growth in immigration and an increase in interracial marriages (US OMB, 1997). The revised standards established a new fifth race category: Native Hawaiian and Pacific Islander.
The current race category standards are:
• American Indian/Alaskan Native: A person having origins in any of the original peoples of North and South America (including Central America) and who maintains tribal affiliation or community attachment.
Undoubtedly the most significant change was the federal guideline that required individuals be given the opportunity to select one or more races. For the first time in the history of the U.S., citizens were allowed to check more than one box to identify their race. Agencies are encouraged to not only collect multiple responses but even more detailed information on specific racial combinations as long as the data reliability and confidentiality concerns can be met.
The preferred federal data collection method uses a two-question format that asks a person to first declare their ethnicity then race. In instances where the data will be obtained from observer-collected methods, a combined format may be employed. The combined format makes use of six minimum categories: the five race categories and Hispanic or Latino. Regardless of the collection method there are no criteria or qualifications such as blood quantum to "prove" race affiliation. The responses are based on self-perception and are by definition accurate.
Unless a person intentionally misreports their ethnicity and/or race, their response is not "wrong" even if clinical tests would indicate otherwise.
The Office of Management and Budget acknowledges that these categories are "neither anthropologically nor scientifically-based, but rather represent a social-political construct designed for collecting data on race and ethnicity of broad population groups in the U.S." (US OMB, 1997).
It is important to note that OMB does not mandate the collection of race and ethnicity data, but rather provides the standards or guidelines by which these data are collected and reported. The OMB standards address the minimum amount of information that should be obtained, thereby assuring the data will be collected in a prescribed and consistent manner. These standards do not apply to states and private industry. The exception is state agencies that receive federal funds.
Agencies and organizations that are not required to use the standards have a reason to adopt them. For example, public health and social service organizations are dependent on data to describe their population(s) and empirically demonstrate a need. Often local data are used in comparison to national data to substantiate a problem or need. These data are usually expressed as rates. The Census is the source of population denominators for rate calculation used by federal offices and other state systems, allowing for national comparisons.
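The dependence of rates on a shared denominator can be made concrete with a minimal sketch. The function name, the per-1,000 scale, and the optional minimum-event threshold are illustrative assumptions, not part of the OMB standards or any DOH procedure.

```python
from typing import Optional


def rate_per(events: int, population: int, per: int = 1_000,
             min_events: int = 0) -> Optional[float]:
    """Return events per `per` persons, or None when no rate can be reported.

    `population` is the denominator (for example, a Census count for the
    same race category used to classify the events). `min_events` is a
    hypothetical reporting threshold, not a published standard.
    """
    if population <= 0 or events < min_events:
        return None
    # Multiply before dividing to limit floating-point rounding.
    return events * per / population


# A local rate and a national rate are comparable only if both use
# numerators and denominators classified with the same categories.
local = rate_per(events=50, population=10_000)
```

If the numerator is coded with one set of categories and the denominator with another, the resulting figure is not a rate for any well-defined population, which is the practical argument for adopting the standards.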

Race and Ethnicity Data Standards
Race and ethnicity are central to public health surveillance activities and programs, and the demand for detailed data continues to grow locally and nationally. How are these terms understood within the professional community? Are they considered synonymous and thus used interchangeably? Are race and ethnicity in fact one and the same? Does it really make a difference if the terms are transposed? What is the gain if these terms are standardized?
The Merriam-Webster dictionary defines "race" as: "a family, tribe, people, or nation belonging to the same stock; a division of mankind possessing traits that are transmissible by descent and sufficient to characterize it as a distinct human type." This definition suggests there is a genetic or biological link to phenotypic traits such as skin color and facial features. "Ethnic" is defined as: "a member of a minority group who retains the customs, language, or social views of the group." "Ethnicity" is thus associated with cultures, behavioral attitudes, beliefs, lifestyle patterns, diet, and environmental living conditions. The dictionary definitions indicate that "race" and "ethnicity" are not synonymous. The reality is that often in the day-to-day of public health these terms are likely to be used more casually.
The Hawaii Department of Health (DOH) does not employ a standardized method to define, collect or report race and ethnicity data, resulting in a number of special challenges. Appendix A provides a dramatic visual representation of the problems. A sample of 11 data sets identified approximately 100 separate race and ethnic categories, highlighting such issues as inconsistency in spelling (e.g., Belauan/Palauan), inconsistent combining of groups (e.g., Samoan and Samoan/Tongan), and formatting differences (American Indian/Alaskan Native and American Indian or Alaskan Native). The disparities across the descriptors and collecting practices are largely based on past practices and are often linked to funding agency guidelines (e.g., federal). While many DOH programs collect ethnic-specific data, these data are reported at an aggregate level. The chance to conduct in-depth analyses on specific ethnic populations is lost. The lack of data conformity results in lost opportunities to mine the data to provide new information. An important goal is to have processes in place whereby race and ethnic data are conformed (standardized) in terms of how they are defined, collected, and aggregated, ensuring the accuracy and usefulness of the information.
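As a rough illustration of what conforming the data could involve, the sketch below maps spelling and formatting variants of the kind described above onto one preferred label. The alias table, the choice of preferred labels, and the function name are hypothetical; Appendix A, not this sketch, is the actual inventory of categories.

```python
import re

# Hypothetical alias table: normalized variant -> preferred label.
# Which spelling is "preferred" is an assumption made for illustration.
ALIASES = {
    "belauan": "Palauan",
    "palauan": "Palauan",
    "american indian/alaskan native": "American Indian or Alaskan Native",
    "american indian or alaskan native": "American Indian or Alaskan Native",
}


def normalize_label(raw: str) -> str:
    """Return the preferred label for a raw category string.

    Unknown labels are returned unchanged (apart from outer whitespace)
    so that nothing is silently reclassified.
    """
    key = re.sub(r"\s*/\s*", "/", raw.strip())  # "A/ B" -> "A/B"
    key = re.sub(r"\s+", " ", key).lower()      # collapse runs of spaces
    return ALIASES.get(key, raw.strip())
```

A lookup like this only addresses spelling and formatting. Inconsistent combining of groups (such as Samoan versus Samoan/Tongan) cannot be undone after collection and has to be resolved in the collection instrument itself.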
A related challenge is the practice of collecting race and ethnicity information that is limited in its usefulness. It would be beneficial to establish guidelines to minimize the use of general categories and improve the initial data collection process.
To nurture the critical issue of standardizing key data elements and to ensure clarity and continuity for readers, this paper has adopted the dictionary definitions of race and ethnicity. The term "race" incorporates: American Indian or Alaska Native, Asian, Black or African-American, Native Hawaiian or Other Pacific Islander, and White. "Ethnicity" references discrete population groups associated by geography, culture or language (e.g., Japanese, Chuukese, Fijian).

Hawaii Department of Health - Vital Statistics
The Office of Health Status Monitoring (OHSM) is one of the key offices within the DOH. Much of the data it collects is used across many Divisions. OHSM has statutory authority and the responsibility under Chapter 38, Hawaii Revised Statute to collect and report all birth and death events. The birth certificate generates a wealth of information (e.g., maternal and infant medical risk factors, presence of congenital abnormalities, obstetrical procedures). It is also an important source for race and ethnicity information. Decision rules established by the DOH in the 1940s are used to assign infant race and ethnicity as reported by the parent(s). Infant ethnicity is derived from the ethnicity of the father. If that is unknown or not reported, then the ethnicity of the mother is used to determine the infant's classification. The decision rules allow parents to report multiple ethnicities on the birth certificate form; however, only one ethnicity is captured electronically from the certificate. The following rules apply if more than one ethnicity is listed:  (Hahn, 1999). This situation is particularly problematic for persons of multiple ethnicities.
A major consideration in using birth and death data is that these data are collected for administrative purposes. As such, their application in public health surveillance, research, policy development, and program evaluation can be challenging, as the administrative goals may not parallel these other needs.

Hawaii Race and Ethnicity Model
In 1999, the DOH leadership envisioned a data warehouse that would serve the Department's data needs. The goals of the Hawaii Health Data Warehouse were to: coordinate resources across Administrations and Divisions, ensure consistency of the public health data with national recommendations, and increase public access to health data. Foremost in the design vision was integrating data from disparate data sets to enrich the information.

Integrating data requires that critical demographic data elements (i.e., race, ethnicity, age, gender, and geography) be available in a standard or common format. As the warehouse has moved from concept to reality, technological requirements have heightened awareness and driven the need to address a number of data issues. While the OMB standards provide a base for defining race, collecting and reporting ethnicity is a more complex matter.
A DOH working group was formed to research methodologies and practices for collecting and reporting race and ethnicity data. Five important criteria were identified. The model must:
• Provide for the continued collection of "program-level" data to support community-based planning and decision-making, and identify health disparities;
• Identify a process whereby ethnicity data could be clustered using a set of standards ensuring that individual ethnic groups are exclusive to a single aggregated racial group;
• Ensure federal reporting standards are met; and
• Increase the capacity to provide population-based (rate) information as long as sufficient numerator and denominator data are available.
The Centers for Disease Control and Prevention and the Agency for Toxic Substances and Disease Registry (CDC/ATSDR) produced a document titled, "Common Data Elements - Implementation Guide" (version 2.4) which proposed standard data elements for use in health information and surveillance systems (US DHHS, 2000). While incorporating the 1997 OMB race standards, the guide also disaggregates each racial group into discrete ethnic sub-categories. This design forms a base for the Hawaii model (see Appendix B).
The Hawaii model distinguishes between race and ethnicity. Aggregated at the highest level are the five OMB categories. These race groups are disaggregated into different levels of specific ethnic populations based on the size of each discrete group in Hawaii. These selected ethnic groups are those most commonly reported by the DOH for public health surveillance. The most discrete ethnic groups listed, including the "unspecified" and "other" categories, are currently in use by different DOH programs.
The Hawaii model is a living or dynamic model. As new information becomes available or data needs change, the model will be revised. Because there is no DOH requirement that data elements be standardized, adoption and application of the model will be implemented through consensus.
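The incremental clustering can be pictured as a lookup table in which every discrete ethnicity has exactly one subgroup and exactly one OMB race. The sketch below is a minimal illustration under stated assumptions: the entries, the subgroup names, and the function `roll_up` are hypothetical and are not the contents of Appendix B.

```python
from collections import Counter
from typing import Dict, Tuple

NHOPI = "Native Hawaiian or Other Pacific Islander"

# ethnicity -> (subgroup, OMB race). Illustrative entries only; larger
# groups keep their own subgroup, smaller ones share an "Other" subgroup.
HIERARCHY: Dict[str, Tuple[str, str]] = {
    "Japanese": ("Japanese", "Asian"),
    "Filipino": ("Filipino", "Asian"),
    "Vietnamese": ("Other Asian", "Asian"),
    "Native Hawaiian": ("Native Hawaiian", NHOPI),
    "Samoan": ("Samoan", NHOPI),
    "Chuukese": ("Other Pacific Islander", NHOPI),
    "Fijian": ("Other Pacific Islander", NHOPI),
}


def roll_up(counts: Dict[str, int], level: str) -> Counter:
    """Aggregate ethnicity-level counts to the 'subgroup' or 'race' level."""
    index = {"subgroup": 0, "race": 1}[level]
    totals: Counter = Counter()
    for ethnicity, n in counts.items():
        totals[HIERARCHY[ethnicity][index]] += n
    return totals


program_counts = {"Japanese": 3, "Vietnamese": 2, "Chuukese": 1, "Fijian": 4}
by_race = roll_up(program_counts, "race")
```

Because the table is keyed by ethnicity, each group can belong to only one race, which is one way to satisfy the working group's exclusivity criterion. Program-level detail stays available, and the race-level totals are derived rather than collected separately.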

A Special Challenge
The collecting and reporting of multi-race and mixed ethnicity data has long been an issue in Hawaii because of diverse population groups. In 1995, 15,407 births occurred in Hawaii with 7,843 (58.1%) reported as mixed ancestry. In 2000, the percent jumped to 61% with 7,990 of 13,120 babies identified as mixed ancestry (State of Hawaii, 2001). Nationally the percent of interracial births for the period 1991-1995 was just under 4% (Atkinson, Macdorman, & Parker, 2001). In California, a state also known for a diverse population mix, less than 2% of mothers reported more than one race on their child's birth certificate for 2000 (Heck et al., 2001).
Despite having a large population reporting multiple ethnicities, technical and programmatic issues preclude many DOH divisions and programs from reporting this information, even though it is collected. Programs that do attempt to report their multi-race or mixed ethnicity data use an assortment of methodologies, including:
• Assigning the person to a single race group using the OMB race categories;
• Requiring (forcing) individuals to self-select a single race or ethnicity; or
(Baker et al., 1999). Analyses showed that "the reporting of health outcomes may differ drastically depending on the method of OMB coding used for multiple race." "Health outcomes, such as asthma and hypertension, may differ significantly depending if individuals were coded full or part race." The prevalence of asthma among Filipinos varied from 6% to over 16% depending on which method was applied.
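A small sketch shows mechanically why the coding method can move a prevalence estimate. The records below are synthetic and exist only to exercise the code; they are not the data analyzed by Baker et al. The method names "full" (only respondents reporting the single group) and "full_or_part" (any respondent listing the group) are one assumed reading of the quoted phrase.

```python
from typing import List, Optional, Sequence, Tuple

# (ethnicities reported, has the condition). Synthetic records only.
Record = Tuple[Sequence[str], bool]

RECORDS: List[Record] = [
    (("Filipino",), False),
    (("Filipino",), False),
    (("Filipino",), True),
    (("Filipino", "Chinese"), True),
    (("Filipino", "Native Hawaiian"), True),
    (("Japanese",), False),
]


def prevalence(records: List[Record], group: str,
               method: str) -> Optional[float]:
    """Share of a group's members with the condition under a coding method."""
    if method == "full":
        members = [r for r in records if list(r[0]) == [group]]
    elif method == "full_or_part":
        members = [r for r in records if group in r[0]]
    else:
        raise ValueError(f"unknown method: {method}")
    if not members:
        return None
    return sum(1 for _, has in members if has) / len(members)


full = prevalence(RECORDS, "Filipino", "full")
full_or_part = prevalence(RECORDS, "Filipino", "full_or_part")
```

In this toy data the same respondents yield one third under the "full" definition and three fifths under "full_or_part", because both the numerator and the denominator change with the definition of group membership. Any reported figure therefore needs the coding method stated alongside it.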

Recommendations
Public health surveillance is the cornerstone of public health practice. Surveillance data, of which race and ethnicity variables are a part, facilitate identifying patterns of health, disease, and personal/group health behaviors. While acknowledging this as true, we find ourselves dealing with an opposite truth. We complain about the dearth of available ethnic-specific data, noting that without it the picture of true health status among subpopulations is inaccurate, inadequate, and ill-defined. Insufficiently detailed race data also limit our ability to report rates, follow trends, and identify patterns in health problems. We struggle with the lack of consensus on how race and ethnicity are defined and measured. We acknowledge that differences in terminology and data collection procedures may affect the reliability and validity of the analyses. We are affected by the changing demographics of the U.S. and Hawaii's population (i.e., increase in multi-race births) and look to methods of accurately using these data.
On the other hand, we debate whether by continuing to report racial and ethnic differences in health outcomes we are maintaining stereotypes, or worse, racist notions. The debate similarly questions whether science supports a relationship between race and ethnicity variables and health outcomes. We ask what role these factors should play in influencing policy, program development, planning, and evaluation. Outside of reporting requirements and simple descriptive statistics, how are these data applied within DOH programs?
Anecdotal reports indicate that some DOH programs do make use of their race data by ensuring that education materials and contact strategies are culturally sensitive and appropriate. Should such practices be applied throughout the DOH or are there special circumstances where it would be appropriate? If a reason for collecting these variables is all races have unique experiences then it follows that single race and mixed-race groups must be treated as distinct. Considering that Hawaii's population has a large percent of persons that self-identify more than one race begs the question, "Which race should be preeminent when designing prevention and intervention strategies to reduce health disparities"? What is the importance of culture, geography, nativity, language competency…?
As the public health profession considers these issues a number of recommendations are proposed for Hawaii:
1. Continue to collect race and ethnicity data as long as differences in health outcomes are a reality with the aim of eliminating the disparities.
2. View race and ethnic data as clues to be mined, not ends in themselves.
3. Consider using a scale that measures factors such as income, residency, education, nativity, language proficiency in conjunction with race and ethnicity. As scale development and validation is an academic activity a partnership with the University of Hawaii might be considered.
4. Be precise when using race and ethnicity by not using the terms interchangeably.
5. Review current ethnic categories and their usefulness in generating information for programs.
6. Review mixed ethnicity data collection method and definition measures.
7. Review method(s) employed to count or classify mixed ethnicity data.
8. Document if the information is based on self-report or observation, and work to eliminate (when feasible and appropriate) observer-derived measures.
9. Document if multiple responses are allowed.
10. Eliminate the use of vague and imprecise descriptors (Other, Mixed, Not Sure, Unknown, etc.).
These recommendations will help establish DOH-wide standards for collecting, defining, measuring and reporting race and ethnicity data. When using data, care must be taken to not imply there is an association between an individual's or group's health status and their race or ethnicity. Health outcomes are the result of a complex relationship of diverse factors including but not limited to genetic/inherited traits, personal health behaviors, access to care, quality of available health care, and socioeconomic status. Careful and complete data interpretations are critical as the application can be used to redirect resources, close or start programs, and affect health policy.
• Asian: A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent, including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.
• Black/African-American: A person having origins in any of the black racial groups of Africa.
• Native Hawaiian/Other Pacific Islander: A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands.
• White: A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.
The 1997 standards retained the two ethnic categories: (1) Hispanic or Latino; and (2) Not Hispanic or Latino with Hispanic defined as: A person of Cuban, Mexican, Puerto Rican, South or Central American or other Spanish culture or origin, regardless of race.
• If Hawaiian is one of the multiple ethnicities listed, Part-Hawaiian is coded.
• If a non-Caucasian* ethnicity is listed with Caucasian, the non-Caucasian ethnicity is coded.
• If there is more than one non-Caucasian ethnicity listed, the first one listed is coded.
• If there is more than one Caucasian ethnicity listed, the first one noted is coded.
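The coding rules above, together with the father-then-mother rule for infants described in the Vital Statistics section, can be sketched as a short function. This is an illustrative reading, not the DOH's software: the set of ethnicities treated as Caucasian is a placeholder (the footnote defining it is not reproduced here), and the order in which the rules are checked is an assumption.

```python
from typing import List, Optional

# Placeholder membership; the actual DOH definition of "Caucasian"
# is not given in the text, so these entries are hypothetical.
CAUCASIAN = {"Caucasian", "Scottish", "Greek"}


def code_ethnicity(listed: List[str]) -> Optional[str]:
    """Reduce the ethnicities listed for one parent to a single code."""
    if not listed:
        return None                      # unknown or not reported
    if len(listed) == 1:
        return listed[0]
    if "Hawaiian" in listed:
        return "Part-Hawaiian"           # Hawaiian among multiple listed
    non_caucasian = [e for e in listed if e not in CAUCASIAN]
    if non_caucasian:
        return non_caucasian[0]          # first non-Caucasian listed
    return listed[0]                     # all Caucasian: first one noted


def infant_ethnicity(father: List[str], mother: List[str]) -> Optional[str]:
    """Use the father's coded ethnicity, falling back to the mother's."""
    coded = code_ethnicity(father)
    return coded if coded is not None else code_ethnicity(mother)
```

Written out this way, the loss of information is easy to see: a parent reported as Chinese and Japanese is coded Chinese only because Chinese was written first, and the infant inherits that single code.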

• Assigning respondents to a single race group using a set of program-specific guidelines.