Computer Science teaching resources (reference): Sampling & Counting for Big Data

Sampling & Counting for Big Data
Yitong Yin, Nanjing University
2019 National Conference of Theoretical Computer Science, August 3, 2019, Lanzhou University

Sampling vs Counting

[Jerrum-Valiant-Vazirani ’86]: for all self-reducible problems, poly-time Turing machines reduce each of the following to the others:
• exact / approx sampling: X = (X1, X2, …, Xn) ∼ Ω
• approx counting: vol(Ω)
• approx inference: Pr[Xi = ⋅ ∣ XS = σ]

(Shown on the slide: Mark R. Jerrum, Leslie G. Valiant, Vijay V. Vazirani, “Random Generation of Combinatorial Structures from a Uniform Distribution”.)

MCMC Sampling

Markov chain for sampling X = (X1, X2, …, Xn) ∼ μ
• Gibbs sampling (Glauber dynamics, heat-bath) [Glauber ’63] [Geman, Geman ’84]:
  pick a random i; resample Xi ∼ μi( ⋅ ∣ X_N(i));
• Metropolis-Hastings algorithm [Metropolis et al. ’53] [Hastings ’70]:
  pick a random i; propose a random c; set Xi = c w.p. ∝ μ(X′)/μ(X);
• Analysis: coupling methods [Aldous ’83] [Jerrum ’95] [Bubley, Dyer ’97] may give O(n log n) upper bounds for mixing time.
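The Gibbs update above can be made concrete. Below is a minimal sketch of my own (not from the slides): Glauber dynamics for the hardcore model, where `glauber_hardcore`, the adjacency-dict input format, and the empty starting configuration are all illustrative choices. Each step picks a uniform random vertex and resamples it from its conditional distribution: an occupied neighbor forces the vertex out; otherwise it is occupied with probability λ/(1+λ).

```python
import random

def glauber_hardcore(adj, lam, steps, rng=None):
    """Glauber dynamics for the hardcore model: sample an independent
    set I of the graph `adj` with probability proportional to lam^|I|.
    X[v] = 1 means vertex v is in the independent set."""
    rng = rng or random.Random(0)
    n = len(adj)
    X = [0] * n  # start from the empty independent set
    for _ in range(steps):
        v = rng.randrange(n)              # pick a uniform random vertex
        if any(X[u] for u in adj[v]):     # an occupied neighbor:
            X[v] = 0                      # v must stay out
        else:                             # conditional: lam / (1 + lam)
            X[v] = 1 if rng.random() < lam / (1 + lam) else 0
    return X

# Usage: a 4-cycle with fugacity 1 (uniform over its independent sets).
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
sample = glauber_hardcore(cycle, lam=1.0, steps=1000)
```

By construction every intermediate configuration is an independent set, so the chain never needs a rejection step.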

Computational Phase Transition

hardcore model: graph G(V,E), max-degree Δ, fugacity λ > 0
approx sample an independent set I in G w.p. ∝ λ^|I|

λc(Δ) = (Δ−1)^(Δ−1) / (Δ−2)^Δ

• [Weitz, STOC’06]: If λ < λc, n^O(log Δ) time.
• [Sly, FOCS’10 best paper]: If λ > λc, NP-hard even for Δ = O(1).
• [Efthymiou, Hayes, Štefankovič, Vigoda, Y., FOCS’16]: If λ < λc, O(n log n) mixing time, provided Δ is large enough and there is no small cycle.

A phase transition occurs at λc. (Slide figure: the (Δ, λ) plane, “Easy” below the curve λc(Δ) and “Hard” above it.)
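For concreteness, the critical threshold can be evaluated numerically. The function name `lambda_c` is my own; the formula is the slide's λc(Δ) = (Δ−1)^(Δ−1) / (Δ−2)^Δ, which is defined for Δ ≥ 3.

```python
def lambda_c(d):
    """Critical fugacity λc(Δ) = (Δ−1)^(Δ−1) / (Δ−2)^Δ of the hardcore
    model on graphs of maximum degree Δ (defined for Δ >= 3)."""
    if d < 3:
        raise ValueError("λc(Δ) is defined for Δ >= 3")
    return (d - 1) ** (d - 1) / (d - 2) ** d

# λc(3) = 2^2 / 1^3 = 4.0 and λc(4) = 3^3 / 2^4 = 1.6875; the threshold
# shrinks roughly like e/Δ for large Δ, so on high-degree graphs the
# problem becomes hard at ever-smaller fugacities.
print(lambda_c(3), lambda_c(4))  # prints: 4.0 1.6875
```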

Big Data?

Sampling and Inference for Big Data

• Sampling from a joint distribution (specified by a probabilistic graphical model).
• Inferring according to a probabilistic graphical model.
• The data (the probabilistic graphical model) is BIG.

• Parallel/distributed algorithms for sampling? ✓ PTIME ⟹ Polylog(n) rounds
• For parallel/distributed computing: sampling ≡ approx counting/inference? ✓ PTIME ⟹ Polylog(n) rounds
• Dynamic sampling algorithms? ✓ PTIME ⟹ Polylog(n) incremental cost

Local Computation

the LOCAL model [Linial ’87]: “What can be computed locally?” [Naor, Stockmeyer, STOC’93, SICOMP’95]
• Communications are synchronized.
• In each round: unlimited local computation and communication with neighbors.
• Complexity: # of rounds to terminate in the worst case.
• In t rounds: each node can collect information up to distance t.

PLOCAL: t = polylog(n)
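The last point can be illustrated with a toy simulation of my own devising (`local_view` and the path graph below are hypothetical): since each round only moves information across one edge, after t rounds a node's knowledge is exactly the BFS ball of radius t around it.

```python
def local_view(adj, v, t):
    """In the LOCAL model, after t synchronized rounds node v has
    collected information from exactly its distance-<=t neighborhood,
    i.e. the BFS ball of radius t around v."""
    ball = {v}
    for _ in range(t):  # one round: every known node reveals its neighbors
        ball |= {u for w in ball for u in adj[w]}
    return ball

# Usage: on the path 0-1-2-3-4, after 2 rounds node 0 sees {0, 1, 2}.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(local_view(path, 0, 2)))  # prints: [0, 1, 2]
```

This is why PLOCAL algorithms, with t = polylog(n) rounds, can only ever act on polylog-radius neighborhoods of each node.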

“What can be sampled locally?”

network G(V,E)
• Joint distribution defined by local constraints:
  • Markov random field
  • Graphical model
• Sample a random solution from the joint distribution:
  • distributed algorithms (in the LOCAL model)

Q: “What locally definable joint distributions are locally sample-able?”

MCMC Sampling

Classic MCMC sampling: Markov chain Xt → Xt+1 on G(V,E):
  pick a uniform random vertex v;
  update X(v) conditioning on X(N(v));
O(n log n) time when mixing

Parallelization:
• Chromatic scheduler [folklore] [Gonzalez et al., AISTATS’11]: vertices in the same color class are updated in parallel.
  • O(Δ log n) mixing time (Δ is max degree)
• “Hogwild!” [Niu, Recht, Ré, Wright, NIPS’11] [De Sa, Olukotun, Ré, ICML’16]: all vertices are updated in parallel, ignoring concurrency issues.
  • Wrong distribution!
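A sequential sketch of the chromatic scheduler, with my own illustrative names (`greedy_coloring`, `chromatic_sweep`) and the hardcore-model update as the example dynamics: because no two vertices in a color class are adjacent, their updates read disjoint neighborhoods and could safely run concurrently; the inner loop simulates that parallel batch one vertex at a time.

```python
import random

def greedy_coloring(adj):
    """Greedy proper coloring: each color class is an independent set."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(len(adj) + 1) if c not in used)
    return color

def chromatic_sweep(adj, lam, X, rng):
    """One chromatic-scheduler sweep of hardcore Glauber dynamics:
    within a color class no two vertices are adjacent, so their updates
    never read each other's state and are safe to run in parallel
    (simulated sequentially here)."""
    color = greedy_coloring(adj)
    for c in sorted(set(color.values())):
        for v in (v for v in adj if color[v] == c):  # parallel in principle
            if any(X[u] for u in adj[v]):
                X[v] = 0  # an occupied neighbor forces v out
            else:
                X[v] = 1 if rng.random() < lam / (1 + lam) else 0
    return X

# Usage: one sweep on a 4-cycle starting from the empty configuration.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
X = chromatic_sweep(cycle, 1.0, {v: 0 for v in cycle}, random.Random(1))
```

Unlike Hogwild!, this schedule never lets two adjacent updates race, which is exactly why it preserves the target distribution at the cost of serializing across color classes.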
Related Computer Science teaching resources (reference papers):
- Dynamic Sampling from Graphical Models.pdf
- Distributed Algorithms for MCMC Sampling.pdf
- Fast Sampling Constraint Satisfaction Solutions via the Lovász Local Lemma.pdf
- Dynamic and Distributed Algorithms for Sampling from Gibbs Distributions.pdf
- Ranged hash functions and the price of churn.pdf
- Fast construction of overlay networks.pdf
- Cell-Probe Proofs.pdf
- Cell-probe proofs and nondeterministic cell-probe complexity.pdf
- Improved FPTAS for Multi-Spin Systems.pdf
- Correlation Decay up to Uniqueness in Spin Systems.pdf
- Assigning Tasks for Efficiency in Hadoop.pdf
- Approximate Counting via Correlation Decay on Planar Graphs.pdf
- Approximate Counting via Correlation Decay in Spin Systems.pdf
- Spatial Mixing of Coloring Random Graphs.pdf
- Spatial mixing and the connective constant - Optimal bounds.pdf
- Simple average-case lower bounds for approximate near-neighbor from isoperimetric inequalities.pdf
- Counting hypergraph matchings up to uniqueness threshold.pdf
- Convergence of MCMC and Loopy BP in the Tree Uniqueness Region for the Hard-Core Model.pdf
- What can be sampled locally?.pdf
- On Local Distributed Sampling and Counting.pdf
- Introduction to the Correlation Decay Method.pdf
- Local Distributed Sampling from Locally-Defined Distribution.pdf
- Rectangle Inequalities for Data Structure Lower Bounds.pdf
- Sampling up to the Uniqueness Threshold.pdf
- Phase transition of hypergraph matchings.pdf
- Beautiful Journey of Theoretical Computer Science.pdf
- Quest for Artificial Intelligence.pdf
- The Magical Wild Animals.pdf
- Counting with Bounded Treewidth.pdf
- Decay of Correlation in Spin Systems.pdf
- POMP: Protocol Oblivious SDN Programming with Automatic Multi-Table Pipelining.pdf
- Progress of Concurrent Objects with Partial Methods.pdf
- A Practical Verification Framework for Preemptive OS Kernels.pdf
- A Program Logic for Concurrent Objects under Fair Scheduling.pdf
- Compositional Verification of Termination-Preserving Refinement of Concurrent Programs.pdf
- Rely-Guarantee-Based Simulation for Compositional Verification of Concurrent Program Transformations.pdf
- Characterizing Progress Properties of Concurrent Objects via Contextual Refinements.pdf
- Modular Verification of Linearizability with Non-Fixed Linearization Points.pdf