Quadratic interior-point methods in statistical disclosure control

Cited by: 10
Authors
Castro, Jordi [1]
Affiliations
[1] Univ Politecn Cataluna, Dept Stat & Operat Res, Pau Gargallo 5, E-08028 Barcelona, Spain
Keywords
Interior-point methods; Quadratic Programming; Large-scale programming; Statistical confidentiality; Controlled perturbation methods;
DOI
10.1007/s10287-004-0029-2
Chinese Library Classification
O1 [Mathematics]; C [Social Sciences, General];
Subject classification codes
03 ; 0303 ; 0701 ; 070101 ;
Abstract
The safe dissemination of statistical tabular data is one of the main concerns of National Statistical Institutes (NSIs). Although each cell of a table aggregates the information of several individuals, statistical confidentiality can still be violated. NSIs must guarantee that no individual information can be derived from the released tables. One widely used class of methods for reducing the disclosure risk is based on perturbing the cell values. We consider a new controlled perturbation method which, given a set of tables to be protected, finds the closest safe ones, thus reducing the information loss while preserving confidentiality. This approach means solving a quadratic optimization problem with a much larger number of variables than constraints. Real instances can yield problems with millions of variables. We show that interior-point methods are an effective choice for that model and that specialized algorithms which exploit the problem structure can be faster than state-of-the-art general solvers. Computational results are presented for instances of up to 1,000,000 variables.
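The "closest safe table" idea in the abstract can be illustrated with a toy sketch. It is not the paper's model or algorithm: it assumes a single two-way table, keeps all row and column totals fixed, forces one sensitive cell to move upward by a chosen protection amount, and ignores the bounds on cell values that make an interior-point method necessary in practice. The function name `closest_safe_table` and its parameters are hypothetical. With only equality constraints, the minimum-distance problem reduces to a minimum-norm solution of an underdetermined linear system.

```python
import numpy as np

def closest_safe_table(table, sensitive, protection):
    """Toy minimum-distance perturbation (illustrative assumption, not the paper's method).

    Finds deviations z minimizing ||z||^2 such that every row and column
    total is unchanged and the sensitive cell moves by `protection`.
    """
    table = np.asarray(table, dtype=float)
    m, n = table.shape
    rows = []
    for i in range(m):                      # deviations in each row sum to zero
        r = np.zeros((m, n))
        r[i, :] = 1.0
        rows.append(r.ravel())
    for j in range(n - 1):                  # last column constraint is implied, so skip it
        c = np.zeros((m, n))
        c[:, j] = 1.0
        rows.append(c.ravel())
    s = np.zeros((m, n))                    # deviation of the sensitive cell is fixed
    s[sensitive] = 1.0
    rows.append(s.ravel())

    C = np.vstack(rows)                     # fewer constraints than variables
    d = np.zeros(C.shape[0])
    d[-1] = protection
    # Minimum-norm solution of C z = d: z = C^T (C C^T)^{-1} d
    z = C.T @ np.linalg.solve(C @ C.T, d)
    return table + z.reshape(m, n)

original = np.array([[20.0, 50.0, 10.0],
                     [8.0, 19.0, 22.0]])
safe = closest_safe_table(original, sensitive=(0, 0), protection=6.0)
print(safe)
```

A realistic instance would also impose lower and upper bounds on every cell and protection requirements on many sensitive cells, which turns the problem into a bound-constrained quadratic program of the kind the abstract describes.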
Pages: 107-121
Number of pages: 15