Documented here, the issue occurs when settings in the virtual hard drive descriptor do not match the data files. In this case it was caused by a VMDK file that was deleted (presumably by VirtualBox) and miraculously retrieved using TOKIWA data recovery. The restored file was uncorrupted but was not recognized by VMware Player. The solution was to unregister the VM and register it anew. Blessed be the goddess.
Wednesday, November 26, 2014
Ugly hack - Elasticsearch plugin (Knapsack) install on Windows
- Under Cygwin, with a mix of POSIX and Windows paths
- Firewalled ES node -- downloaded to RDP client and pulled binary via tsclient
- ES server is embedded in a proprietary stack
$ 'C:\Program Files\ObscureDistro\java/bin/java' -Xmx64m -Xms16m -Delasticsearch '-Des.path.home=/cygdrive/c/Program Files/ObscureDistro/server/bin' -cp 'C:\Program Files\ObscureDistro\server\lib.obscure\*' org.elasticsearch.plugins.PluginManager -install knapsack -url 'file:///\\tsclient\C\Users\someone\Downloads\elasticsearch-knapsack-1.3.2.0-plugin.zip'
-> Installing knapsack...
Trying file://///tsclient/C/Users/someone/Downloads/elasticsearch-knapsack-1.3.2.0-plugin.zip...
Downloading .....DONE
Installed knapsack into C:\cygdrive\c\Program Files\ObscureDistro\server\bin\plugins\knapsack
- ES node must be restarted
Tuesday, November 25, 2014
Using strace
Trace system calls from a process and all its children and threads:
sudo strace -f -p 3914 2>&1 | grep -vE 'clock_gettime|SIGSTOP|gettime|epoll|futex|restart' | head -10000 | less
-f: follow child processes and threads created by the traced process
-p: PID of the process to attach to
grep: filter out frequent, uninteresting calls
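To see what the grep filter keeps without attaching to a live process, here is a minimal sketch that runs the same pattern over a few lines shaped like strace -f output. The sample lines, PIDs and file names are invented for illustration only.

```shell
#!/usr/bin/env bash
# Sample lines are invented; they only mimic the shape of `strace -f` output.
sample='[pid  3914] futex(0x7f2a4c0009d0, FUTEX_WAIT, 3920, NULL <unfinished ...>
[pid  3921] epoll_wait(5, [], 1024, 100) = 0
[pid  3921] clock_gettime(CLOCK_MONOTONIC, {1234, 5678}) = 0
[pid  3922] open("/etc/hosts", O_RDONLY) = 7
[pid  3922] read(7, "127.0.0.1 localhost\n", 4096) = 20
[pid  3922] close(7) = 0'

# Same pattern as in the strace command: drop the high-frequency calls.
noisy='clock_gettime|SIGSTOP|gettime|epoll|futex|restart'
filtered=$(printf '%s\n' "$sample" | grep -vE "$noisy")

# Only the open/read/close lines should remain.
printf '%s\n' "$filtered"
```

If an interesting call disappears from the real trace, check whether its name contains one of the filtered substrings.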
Thursday, November 13, 2014
Tuesday, November 11, 2014
Elasticsearch - useful cats
http://myhost:9200/_cat/thread_pool?v
Shows:
host ip bulk.active bulk.queue bulk.rejected index.active index.queue index.rejected search.active search.queue search.rejected
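The rejected columns are usually the ones worth watching. A rough sketch of pulling out only the non-zero rejected counters with awk follows; the two sample rows (node names, IPs, counts) are made up, and only the column names come from the header above.

```shell
#!/usr/bin/env bash
# Sample output is invented; the column names match the header shown above.
sample='host   ip        bulk.active bulk.queue bulk.rejected index.active index.queue index.rejected search.active search.queue search.rejected
node-a 10.0.0.1            0          0             0            1           0              0             2            0               0
node-b 10.0.0.2            4         50            17            0           0              0             1            0               3'

# Remember the header names, then print host, column and value
# for every *.rejected counter that is greater than zero.
report=$(printf '%s\n' "$sample" | awk '
  NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
  { for (i = 3; i <= NF; i++)
      if (name[i] ~ /\.rejected$/ && $i > 0) print $1, name[i], $i }')

printf '%s\n' "$report"
```

Against a live node, the printf could presumably be swapped for curl -s 'http://myhost:9200/_cat/thread_pool?v', assuming the node is reachable from the shell.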
Sunday, November 09, 2014
Friday, November 07, 2014
Elasticsearch multilevel aggregation
/*
SELECT count(*)
FROM docs
GROUP BY storm_data_spout.task_id
UNION
SELECT count(*)
FROM docs
GROUP BY storm_data_bolt.task_id
*/
{
"query": {
"match_all": {}
},
"aggs": {
"spout": {
"terms": {
"field": "storm_data_spout.task_id"
}
},
"bolt": {
"terms": {
"field": "storm_data_bolt.task_id"
}
}
}
}
// ======================
/*
SELECT count(*)
FROM docs
GROUP BY storm_data_spout.task_id, storm_data_bolt.task_id
-- embedded agg not supported for multilevel using terms agg. Using script workaround per http://bit.ly/1uI76eO
*/
{
"query": {
"match_all": {}
},
"aggs": {
"spout-bolt": {
"terms": {
"script": "doc['storm_data_spout.task_id'].getValues() + '|' + doc['storm_data_bolt.task_id'].getValues()"
}
}
}
}
Thursday, November 06, 2014
rsync
Synchronize directory trees incrementally (only newer files get pushed)
rsync -vazhu ~/git/myproj --exclude 'node_modules/' --exclude '.git/' --exclude '.idea/' myser@dev01:~/git
-a: archive (preserve timestamps/permissions)
-v: verbose
-h: human-readable output
-z: compress
-u: update; skip files that are newer on the destination
--exclude: self-explanatory. Be sure to give a separate --exclude for every excluded path
The trailing slash on the source path matters. With a trailing slash, rsync copies the contents of that directory into the destination path. Without it, rsync creates the directory itself inside the destination (e.g. ~/git/myproject/src/src/). In the command above the source has no trailing slash, so myproj itself is created under ~/git on the remote host.
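The trailing-slash behavior is easy to check locally before pushing to a remote host. This is a small sketch using throwaway directories under a temp dir; the directory and file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directories; nothing outside the temp dir is touched.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/with_slash" "$demo/without_slash"
echo hello > "$demo/src/a.txt"

if command -v rsync >/dev/null 2>&1; then
  # Trailing slash: the CONTENTS of src land directly in the destination.
  rsync -a "$demo/src/" "$demo/with_slash/"
  # No trailing slash: the directory src itself is created in the destination.
  rsync -a "$demo/src" "$demo/without_slash/"
  # List the resulting files relative to the temp dir.
  (cd "$demo" && find with_slash without_slash -type f | sort)
else
  echo "rsync not installed; skipping demo" >&2
fi
```

The first copy should produce with_slash/a.txt, the second without_slash/src/a.txt.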